IN THE UNITED STATES PATENT AND TRADEMARK OFFICE

In re Application of: : PCT Date: 11/9/99 

F. BORDON-PALLIER et al :

PCT No. : PCT/FR99/02738 : 

Filed: Concurrently Herewith 

For: HUMAN. . .PROTEIN : 

600 Third Avenue 
New York N.Y. 10016 

PRELIMINARY AMENDMENT 

Asst. Commissioner for Patents 
Washington, D.C. 20231 

Sir: 

Please amend this application as follows: 

IN THE SPECIFICATION : 

Page 1, before line 1, insert 

--This application is a 371 of PCT/FR99/02738 filed November 
9, 1999. -- 

IN THE CLAIMS : 

Claim 3 (amended) DNA sequence of the htfIIIA gene according
to claim 1 containing the nucleotide sequence SEQ ID No: 3.

Claim 4 (amended) DNA sequence of the htfIIIA gene according
to claim 1 containing the nucleotide sequence SEQ ID No: 4.

Claim 5 (amended) DNA sequence according to claim 4 having 
the sequence beginning at nucleotide 176 and finishing at the 
nucleotide 1270 of SEQ ID No: 3. 

Claim 6 (amended) DNA sequence coding for the human
transcription factor hTFIIIA according to claim 1 as well as the
DNA sequences which hybridize with it and/or show a significant
homology with this sequence or fragments of it and which code for
a protein with the same function.

Claim 7 (amended) DNA sequence according to claim 1 
comprising modifications introduced by suppression, insertion 
and/or substitution of at least one nucleotide coding for a protein 
with the same biological activity as human transcription factor 
hTFIIIA. 

Claim 8 (amended) DNA sequence according to claim 1 as well 
as similar DNA sequences which have nucleotide sequence homology of 
at least 50% or at least 60% and preferably at least 70% with the 
said DNA sequence. 

Claim 9 (amended) DNA sequence according to claim 1 as well 
as similar DNA sequences which code for a protein, the AA sequence 
of which has a homology of at least 40% and in particular 45% or at
least 50%, rather at least 60% and preferably at least 70% with the 
AA sequence coded by the said DNA sequence. 

Claim 10 (amended) Polypeptide having the function of human 
transcription factor hTFIIIA and with the amino acid sequence SEQ 
ID No: 2 coded by the DNA sequence according to claim 1 and the 
analogues of this polypeptide. 

Claim 11 (amended) Process for the preparation of the hTFIIIA 
recombinant protein having the amino acid sequence SEQ ID No: 2 
comprising the expression of the DNA sequence according to claim 1 
in an appropriate host, then isolation and purification of the said 
recombinant protein. 

Claim 12 (amended) Expression vector containing the DNA
sequence according to claim 3.

Cancel claims 15 and 16 and add the following claims:

--17. A method of treating a disease linked to transcription 
control disorders in warm-blooded animals comprising administering 
to warm-blooded animals in need thereof an amount of the DNA 
sequence of claim 1 or the human transcription factor coded by the 
sequence sufficient to treat said diseases. 

18. The method of claim 17 wherein the disease is cancer. -- 

REMARKS 

The amendment is presented to insert reference to the PCT 
application, to remove multiple dependency from the claims and to 
present proper method of use claims. A marked up copy of the
amended claims is submitted herewith. 

Respectfully submitted, 
Bierman, Muserlian and Lucas 

By:

Charles A. Muserlian #19,683 
Attorney for Applicants 
Tel.# (212) 661-8000 

CLAIMS

1) DNA sequence of the htfIIIA gene coding for a protein
having the biological function of human transcription factor
hTFIIIA.

2) DNA sequence of the htfIIIA gene of the human
transcription factor hTFIIIA according to claim 1, coding for
the amino acid sequence SEQ ID N°2.

3) DNA sequence of the htfIIIA gene according to claim 1
containing the nucleotide sequence SEQ ID N°3.

4) DNA sequence of the htfIIIA gene according to claim 1
containing the nucleotide sequence SEQ ID N°4.

5) DNA sequence according to claim 4 having the sequence
beginning at nucleotide 176 and finishing at the nucleotide
1270 of SEQ ID N°3.

6) DNA sequence coding for the human transcription factor
hTFIIIA according to claim 1 as well as the DNA
sequences which hybridize with it and/or show a significant
homology with this sequence or fragments of it and which code
for a protein with the same function.

7) DNA sequence according to claim 1 comprising
modifications introduced by suppression, insertion and/or
substitution of at least one nucleotide coding for a protein
with the same biological activity as human transcription
factor hTFIIIA.

8) DNA sequence according to claim 1 as well as
similar DNA sequences which have nucleotide sequence homology
of at least 50 % or at least 60 % and preferably at least 70
% with the said DNA sequence.

9) DNA sequence according to claim 1 as well as
similar DNA sequences which code for a protein, the AA
sequence of which has a homology of at least 40 % and in
particular 45 % or at least 50 %, rather at least 60 % and
preferably at least 70 % with the AA sequence coded by the
said DNA sequence.

10) Polypeptide having the function of human transcription
factor hTFIIIA and with the amino acid sequence SEQ ID N°2
coded by the DNA sequence according to claim 1
and the analogues of this polypeptide.

11) Process for the preparation of the hTFIIIA recombinant
protein having the amino acid sequence SEQ ID N°2 comprising
the expression of the DNA sequence according to claim
1 in an appropriate host, then isolation and purification
of the said recombinant protein.

12) Expression vector containing the DNA sequence according
to claim 3.

13) Host cell transformed with a vector according to claim 12 

14) Plasmid deposited at the CNCM under the number I-2071.

15) Use of the human transcription factor htfIIIA gene or of
the human transcription factor coded by this gene according
to one of the claims 1 to 10 for the preparation of
compositions which can be used for the diagnosis or treatment
of diseases linked to a disorder in transcription control.

16) Use according to claim 15 for which the disease concerned
is cancer.

Human htfIIIA gene and coded hTFIIIA protein

The present invention relates to the gene coding for the
human transcription factor, hereafter called the htfIIIA (or
htfC2) gene, and the coded hTFIIIA protein, as well as the use
of this htfIIIA gene and that of the coded hTFIIIA protein in
the diagnosis and identification of certain diseases related
to the transcription mechanism.

Hereafter the gene coding for the transcription factor
TFIIIA will be called tfIIIA (or tfC2) and the gene coding
for the human transcription factor hTFIIIA will be called
htfIIIA.

The human htfIIIA gene therefore codes for the corresponding
hTFIIIA protein.

We will also use the following abbreviations below: AA
for amino acids, NA for nucleic acids, bp for base pairs, DNA
for deoxyribonucleic acid, cDNA for complementary DNA, RNA
for ribonucleic acid, RNase for ribonuclease and C for
deoxycytidine.

The term screening, which indicates a specific screening
technique, and the term primer, which indicates an
oligonucleotide used as primer, will also be used.

The tfIIIA gene and the corresponding TFIIIA protein are
involved in the regulation of the biological transcription
mechanism as indicated below.

Since the TFIIIA protein was purified as a transcription
factor for the first time in 1980 from Xenopus oocytes
[Segall et al, J. Biol. Chem., 255, 11986-11991 (1980)], work
has been carried out in vivo and in vitro in Xenopus
to study the mechanism of transcription control exercised by
TFIIIA. It has thus been shown that Xenopus TFIIIA is
necessary for the initiation of the transcription of the 5S RNA
gene [Sakonju et al, Cell, 19, 13-25 (1980)] and binds to an
internal control region of the 5S RNA gene [Bogenhagen et al,
Cell, 19, 27-35 (1980)].

The nucleotide sequence of the cDNA of Xenopus TFIIIA
and the corresponding amino acid sequence have already been
published [Ginsberg et al, Cell, 39, 479-489 (1984)]. It can be
noted that this gene codes for a structure of 9 zinc fingers,
a zinc finger corresponding to the repetition of the CYS2
HIS2 (C2H2) moiety. This zinc finger structure is considered
an essential domain for a group of proteins which bind
to DNA (DNA binding proteins) [Miller et al,
Embo J., 4, 1607-1614 (1985)].

Human DNA-binding transcription factors which also have
this zinc finger structure are thus known, such as for example
WT1 of the human Wilms tumor gene
[Gessler et al, Nature, 343, 774-778 (1990)], the YY1 human
transcription repressor [Shi et al, Cell, 67, 377-388
(1991)], the MAZ protein associated with the human MYC gene
[Bossone et al, Proc. Natl. Acad. Sci. USA, 89, 7452-7456
(1992)] or also Sp1 [Kuwahara et al, J. Biol. Chem., 29,
8627-8631 (1990)].

Studies have been carried out in order to isolate the
human htfIIIA gene, but until now none have led to discovery
of the true sequence of the htfIIIA gene.

On the one hand, the studies described in European
Application EP 0704526 (Fujisawa et al) and examined in the
article Arakawa et al (1995), Cytogenet Cell Genet 70, 235-238,
can be mentioned; they have led to a sequence that we will call
Arakawa htfIIIA. On the other hand, the studies described in the
article DREW et al (1995), Gene 159, 215-218, have led to a
sequence that we will call DREW htfIIIA. These DREW and ARAKAWA
htfIIIA sequences are represented in Figures 4 and 5 respectively
below.

The documents indicated above therefore each describe a
sequence of the htfIIIA gene, but these two sequences differ
from one another by a few nucleotides and differ from the
htfIIIA gene of the present Application as indicated below.

The present invention has made it possible to isolate
the gene coding for the human transcription factor hTFIIIA.

The present invention has also made it possible to
reveal the nucleic acid sequence of the htfIIIA gene and also
the amino acid sequence of the hTFIIIA protein coded by this
gene.

Therefore a subject of the present invention is the DNA
sequence of the htfIIIA gene coding for a protein having the
biological function of human transcription factor hTFIIIA.

A precise subject of the present invention is the DNA
sequence of the htfIIIA gene of human transcription factor
hTFIIIA as defined above, coding for the amino acid sequence
SEQ ID N°2.

Such a SEQ ID N°2 sequence of the present invention
therefore comprises 365 amino acids.

A subject of the present invention is also the DNA
sequence of the htfIIIA gene as defined above, containing the
nucleotide sequence SEQ ID N°3.

A subject of the present invention is the DNA sequence
of the htfIIIA gene as defined above, containing the
nucleotide sequence SEQ ID N°4.

A subject of the present invention is also the DNA
sequence of the htfIIIA gene as defined above, corresponding
to the nucleotide sequence SEQ ID N°3.
The sequence SEQ ID N°3 therefore comprises 1273 nucleotides.

A particular subject of the present invention is the DNA
sequence of the htfIIIA gene as defined above, corresponding
to the nucleotide sequence SEQ ID N°4. The sequence SEQ ID
N°4 therefore comprises 1213 nucleotides.

The sequence SEQ ID N°1 represents the nucleotide sequence of
the htfIIIA gene according to the present invention, i.e.
SEQ ID N°3, on the upper line, and the corresponding amino acid
sequence (AA) of this nucleotide sequence, i.e. SEQ ID N°2, on
the lower line.

Figures 1 and 2 below represent the AA sequence coded by
htfIIIA of the present invention, SEQ ID N°2, on the upper
line, and the AA sequences coded by the DREW htfIIIA gene,
in Figure 1, and by the ARAKAWA htfIIIA gene, in Figure 2, on
the lower line respectively; these DREW and ARAKAWA sequences
are as published in the documents referred to above.

Figure 3 below represents the comparison of the AA sequences
coded by the DREW and ARAKAWA htfIIIA genes, with
the AA sequence coded by Arakawa htfIIIA on the upper line
and the AA sequence coded by DREW htfIIIA on the lower line.

Figure 2 therefore shows that the corresponding AA
sequence of hTFIIIA according to the present invention
comprises differences from the AA sequence published in the
ARAKAWA article or EP 0704 526, in particular in the
corresponding positions 105 and 163, 156 and 214, 320 to 329
and 378 to 387 respectively, these positions being given in
relation to the numbering indicated in Figure 2.

Figure 2 also shows that the AA sequence coded by
htfIIIA of the present invention begins at position 59 of the
AA sequence coded by Arakawa htfIIIA.

Figure 3 shows that the AA sequences coded by Arakawa
and DREW htfIIIA comprise differences at the corresponding
positions 214 and 154, 378-387 and 318-327 respectively,
these positions being given in relation to the numbering
indicated in Figure 3.

Figure 5 shows that the Arakawa htfIIIA sequence codes
for a protein, the amino acid sequence of which, indicated in
EP 0704 526, begins with the AA methionine specified by the
ATG codon which is found in position 20-22, and the
translation stops at a TAA codon. If the nucleotide sequence
of htfIIIA according to the present invention, SEQ ID N°3, is
compared with the nucleotide sequence of EP 0704 526, i.e.
Arakawa htfIIIA shown in Figure 5 (sequence p11-12-13 of EP
0704 526), it can be noted that the latter lacks a C nucleotide
in position 127 of the EP 0704 526 sequence. This additional C
nucleotide results in a shift in the translation of amino
acids of this nucleotide sequence: in fact, the ATG which is
found in position 20-22 of the ARAKAWA sequence shown in
Figure 5, and which is considered to be a start codon of
protein synthesis by ARAKAWA, is therefore no longer in the
same reading frame because of this shift. By taking into
consideration this additional C nucleotide, the translation
of AA reveals a TGA stop codon in position 57-59 of the
ARAKAWA sequence shown in Figure 5. Consequently, the start
codon of protein synthesis according to the present
invention is located downstream of this stop codon.
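
The general effect of a single inserted nucleotide on a reading frame can be illustrated with a short sketch. The sequence used is an invented toy example, not the Arakawa sequence or SEQ ID N°3, and the function names are hypothetical; the sketch only shows how one additional C shifts every downstream codon and can bring a stop codon into frame.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def codons(seq, start=0):
    """Split seq into complete triplets, reading from the 0-based index start."""
    return [seq[i:i + 3] for i in range(start, len(seq) - 2, 3)]


def first_stop(seq, start=0):
    """Return the index of the first in-frame stop codon, or None if there is none."""
    for index, codon in enumerate(codons(seq, start)):
        if codon in STOP_CODONS:
            return index
    return None


def insert_nucleotide(seq, position, nucleotide):
    """Insert one nucleotide before the 0-based position."""
    return seq[:position] + nucleotide + seq[position:]


toy = "ATGGATGAACGT"                       # invented sequence, no in-frame stop
shifted = insert_nucleotide(toy, 3, "C")   # one additional C after the first codon

print(codons(toy), first_stop(toy))
print(codons(shifted), first_stop(shifted))
```

In the toy case the inserted C leaves the first codon untouched but changes every codon after it, which is the kind of shift the comparison above describes.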

Translation experiments in vitro of SEQ ID N°4 and expression
tests in mammalian cells such as Cos cells have made it
possible to identify the start codon of the protein
synthesis of hTFIIIA according to the present invention.

This start codon of protein synthesis of hTFIIIA
according to the present invention is the CTG codon in
position 176-178 of SEQ ID N°3 (which would correspond to
position 194-196 of the ARAKAWA sequence shown in Figure 5).

The coding section of the htfIIIA gene of the present
invention therefore begins with this CTG codon which is found
in position 176-178 of SEQ ID N°3, which would normally
correspond to the AA leucine and which in fact corresponds to
the AA methionine as this codon is recognised as a start codon
(ref: David S. Peabody, The Journal of Biological Chemistry,
vol. 264, n°9, pp. 5031-5035, 1989).
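
The decoding rule described above, where a CTG used as initiator is read as methionine while an internal CTG is read as leucine, can be sketched as follows. The codon table is deliberately partial and the open reading frame is an invented toy example (only the first two residues, M and D, echo this description); the function name is hypothetical.

```python
# Deliberately partial codon table: just enough for the toy example below.
CODON_TABLE = {"ATG": "M", "CTG": "L", "GAT": "D"}
STOP_CODONS = {"TAA", "TAG", "TGA"}


def translate(orf):
    """Translate an open reading frame, decoding the initiator codon as Met.

    The first codon is read as methionine whatever its sequence (assumption:
    the caller has already identified it as the start codon); later codons
    follow the table, and translation stops at the first stop codon.
    """
    protein = []
    for i in range(0, len(orf) - 2, 3):
        codon = orf[i:i + 3]
        if codon in STOP_CODONS:
            break
        protein.append("M" if i == 0 else CODON_TABLE[codon])
    return "".join(protein)


# Toy ORF: CTG (initiator), GAT, CTG (internal), TAA (stop).
print(translate("CTGGATCTGTAA"))
```

The same CTG triplet therefore yields two different residues depending on whether it is the initiator or an internal codon.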

Consequently, as Figure 2 shows, the ARAKAWA hTFIIIA
protein is longer than the hTFIIIA protein of the present
invention.

Furthermore, if the hTFIIIA protein of the present
invention and the DREW hTFIIIA protein are compared
(comparison shown in Figure 1), it is noticed that the amino
acid threonine in position 105 of the hTFIIIA protein of the
present invention corresponds to an asparagine residue in
position 103 in the DREW hTFIIIA sequence and that the first
two AA, M and D, of the hTFIIIA protein of the present
invention have not been determined for the DREW hTFIIIA
protein. The absence of codons specifying these AA, and in
particular the absence of the start codon of protein
synthesis, does not permit the expression of this protein.
The DREW htfIIIA sequence shown in Figure 4 is therefore
incomplete, and this is recognised by the authors of the
publication referred to above (DREW et al on page 216 lines
39-41). It can be noted moreover that the authors of this
article also think that the start codon of the DREW htfIIIA
sequence should correspond to a methionine coded by ATG as in
the ARAKAWA sequence.

The htfIIIA gene according to the present invention is
therefore different from the DREW and ARAKAWA htfIIIA genes
(EP 0704526) and codes for a hTFIIIA protein, the AA sequence
of which is different from that of the DREW and ARAKAWA
hTFIIIA proteins.

Therefore a particular subject of the present invention
is the DNA sequence of the htfIIIA gene as defined above
containing the nucleotide sequence SEQ ID N°3.

A more particular subject of the present invention is
the DNA sequence as defined above having the sequence
beginning at nucleotide 176 and finishing at nucleotide 1270
of SEQ ID N°3.

One such sequence of the present invention therefore
begins at a CTG codon and thus comprises 1095 nucleotides.
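
The arithmetic behind these figures can be checked with a short sketch. Only the positions 176 and 1270 are taken from the text; the function name is illustrative.

```python
def coding_region_length(first_nt, last_nt):
    """Length of an inclusive, 1-based nucleotide interval."""
    return last_nt - first_nt + 1


length = coding_region_length(176, 1270)
codon_count, remainder = divmod(length, 3)

# 1095 nucleotides form 365 complete codons with nothing left over,
# which agrees with the 365 amino acids stated for SEQ ID N°2.
print(length, codon_count, remainder)
```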

A subject of the present invention is the DNA sequence
coding for the human transcription factor hTFIIIA as defined
above as well as the DNA sequences which hybridize with it
and/or show a significant homology with this sequence or
fragments of it and coding for proteins having the same
function.

By sequences which hybridize are included DNA sequences which
hybridize with one of the DNA sequences above under standard
conditions of high, medium or low stringency. By proteins
with the same function are included polypeptides with the
same transcription factor function. The stringency
conditions are those carried out in conditions known to a
person skilled in the art, such as those described by
Sambrook et al (1989) Molecular cloning, Cold Spring Harbor
Laboratory Press, 1989. Such stringency conditions are for
example hybridization at 65°C, for 18 hours in a 5 x SSPE; 10
x Denhardt's; 100 µg/ml ssDNA; 1 % SDS solution followed by
washing 3 times for 5 minutes with 2 x SSC; 0.05 % SDS, then
washing 3 times for 15 minutes at 65°C in 1 x SSC; 0.1 % SDS.
The high stringency conditions for example include
hybridization for 18 hours at 65°C in a 5 x SSPE; 10 x
Denhardt's; 100 µg/ml ssDNA; 1 % SDS solution, followed by
washing twice for 20 minutes with a 2 x SSC; 0.05 % SDS
solution at 65°C followed by a final wash for 45 minutes in a
0.1 x SSC; 0.1 % SDS solution at 65°C. Medium stringency
conditions for example include a final washing for 20 minutes
in a 0.2 x SSC, 0.1 % SDS solution at 65°C.

By sequences which show a significant homology are included
sequences with a nucleotide sequence with a similarity of at
least 50 % with one of the DNA sequences above and which
code for a protein having the same transcription factor
function.
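
As an illustration of a similarity threshold of this kind, the sketch below computes a simple percent identity between two sequences. This description does not state how similarity is to be calculated, so the measure used here (identical positions over two pre-aligned, equal-length, gap-free sequences) is an assumption, and the sequences and function names are invented for the example.

```python
def percent_identity(seq_a, seq_b):
    """Percentage of identical positions between two pre-aligned sequences.

    Assumption: both sequences have the same length and contain no gaps.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    if not seq_a:
        raise ValueError("sequences must not be empty")
    matches = sum(a == b for a, b in zip(seq_a, seq_b))
    return 100.0 * matches / len(seq_a)


def meets_threshold(seq_a, seq_b, threshold=50.0):
    """True when the percent identity reaches the given threshold."""
    return percent_identity(seq_a, seq_b) >= threshold


# Invented 10-nucleotide example with two mismatches.
print(percent_identity("ATGCATGCAT", "ATGCTTGCAA"))
print(meets_threshold("ATGCATGCAT", "ATGCTTGCAA", threshold=70.0))
```

Real comparisons would normally start from an alignment that allows gaps; the fixed-length comparison is kept only to make the threshold idea concrete.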

A subject of the present invention is also the DNA
sequence as defined above comprising modifications introduced
by suppression, insertion and/or substitution of at least one
nucleotide coding for a protein having the same biological
activity as the human transcription factor hTFIIIA.

A particular subject of the present invention is the DNA
sequence as defined above as well as similar DNA sequences
which have a nucleotide sequence homology of at least 50 % or
at least 60 % and preferably at least 70 % with the said DNA
sequence.

Therefore a subject of the present invention is also the 
DNA sequence as defined above as well as the DNA sequences 
which code for a protein, the AA sequence of which has a 
homology of at least 40 % and in particular of 45 % or at 
least 50 %, rather at least 60 % and preferably at least 70 % 
with the AA sequence coded by the said DNA sequence. 

The gene of the present invention is represented as a
single strand DNA sequence but it is understood that the
present invention includes the complementary DNA sequence of
this single strand DNA sequence, and also includes the
so-called double strand DNA sequence constituted by these two
DNA sequences complementary to each other.
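
The relationship between a single strand and its complementary strand can be sketched in a few lines. The input is a short invented sequence and the function name is illustrative.

```python
# Watson-Crick pairing: A with T, C with G.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the complementary strand, written 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


strand = "CTGGAT"                       # invented example
print(reverse_complement(strand))
# Taking the reverse complement twice gives back the original strand.
print(reverse_complement(reverse_complement(strand)) == strand)
```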

The DNA sequence of the present invention is an example 
of the combination of codons coding for the amino acids 
corresponding to the amino acid sequence SEQ ID N°2, but it 
is also understood that the present invention includes any 
other arbitrary combination of codons coding for this same 
amino acid sequence SEQ ID N°2. 
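
The number of codon combinations that can code for one and the same amino acid sequence follows from the degeneracy of the standard genetic code, as the sketch below shows for short invented peptides. The function name is hypothetical, and the count ignores the special case described above in which a CTG initiator is decoded as methionine.

```python
from math import prod

# Number of codons per amino acid in the standard genetic code (61 sense codons).
CODONS_PER_AA = {
    "M": 1, "W": 1,
    "F": 2, "Y": 2, "H": 2, "Q": 2, "N": 2, "K": 2, "D": 2, "E": 2, "C": 2,
    "I": 3,
    "A": 4, "G": 4, "P": 4, "T": 4, "V": 4,
    "L": 6, "S": 6, "R": 6,
}


def count_encodings(peptide):
    """Number of distinct codon combinations coding for the peptide."""
    return prod(CODONS_PER_AA[aa] for aa in peptide)


print(count_encodings("MD"))    # one codon for M, two for D
print(count_encodings("MDL"))   # six further choices for L
```

Because the count is a product over all residues, it grows very quickly with peptide length.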

The DNA sequence as defined above or this modified DNA
sequence as indicated above, can be prepared according to
techniques known to a person skilled in the art and in
particular those described in the book by Sambrook, J.,
Fritsch, E. F. & Maniatis, T. (1989) entitled "Molecular
cloning: a laboratory manual", Cold Spring Harbor Laboratory,
Cold Spring Harbor NY. In particular the DNA sequence above can be a
cDNA sequence obtained by identification of the 3' and 5'
parts of the coding sequence, then amplification of these
parts using a DNA polymerase such as Pfu polymerase or other
DNA polymerases. The introduction, into the oligonucleotide
sequence used for PCR, of restriction sites such as HindIII
or SmaI allows the cloning of these fragments in appropriate
vectors and then the restoration of the sought complete
sequence. A detailed description of the operating conditions
in which the present invention was carried out is given
below.

A quite particular subject of the invention is the
polypeptide having the function of human transcription factor
hTFIIIA and having the amino acid sequence SEQ ID N°2 coded
by the DNA sequence as defined above and the analogues of
this polypeptide.

By analogues is understood the polypeptides the amino 
acid sequence of which has been modified by substitution, 
suppression or addition of one or more amino acids but which 
retain the same biological function. Such polypeptide 
analogues can be produced spontaneously or can be produced by 
post-transcriptional modification or also by modification of 
the DNA sequence of the present invention as indicated above, 
by using techniques known to a person skilled in the art: 
amongst these techniques, the directed mutagenesis technique 
(Kramer, W., et al., Nucl. Acids Res., 12, 9441 (1984);
Kramer, W. and Fritz, H.J., Methods in Enzymology, 154, 350
(1987); Zoller, M.J. and Smith, M., Methods in Enzymology,
100, 468 (1983)) can in particular be mentioned.

Modified DNA synthesis can also be carried out by using
well-known chemical synthesis techniques such as the
phosphotriester method for example [Letsinger, R.L. and
Ogilvie, K.K., J. Am. Chem. Soc., 91, 3350 (1969); Merrifield,
R.B., Science, 150, 178 (1968)] or the phosphoramidite method
[Beaucage, S.L. and Caruthers, M.H., Tetrahedron Lett., 22,
1859 (1981); McBride, L.J. and Caruthers, M.H., Tetrahedron
Lett., 24, 245 (1983)] or also by the combination of these
methods.

The polypeptides of the present invention can therefore
be prepared by techniques known to a person skilled in the
art, in particular partially by chemical synthesis or also by
cDNA synthesis by expression in a procaryotic or eucaryotic
host cell as indicated below.

A particular subject of the present invention is the
process for the preparation of the recombinant hTFIIIA
protein having the amino acid sequence SEQ ID N°2. This
process includes the expression of the DNA sequence as
defined above in an appropriate host, then isolation and
purification of the said recombinant protein.

To produce the polypeptide of the present invention,
recombinant DNA techniques using genetic engineering and cell
culture methods known to a person skilled in the art can in
particular be used. The following stages can therefore be
carried out: firstly preparation of the appropriate gene,
then incorporation of this gene into a vector, transfer of
the gene carrier vector into an appropriate host cell,
expression of the gene and finally purification of the
protein coded by this gene.

The DNA sequences according to the present invention and in
particular SEQ ID N°3 or SEQ ID N°4 can be prepared according
to techniques known to a person skilled in the art, in
particular by chemical synthesis, by screening of a gene bank
or a cDNA bank with synthetic oligonucleotide probes using
known hybridization techniques or also by reverse
transcription from messenger RNA (mRNA).

The advantage of the technique comprising firstly the
isolation of mRNA by extraction of the total RNA then the
synthesis of cDNA from this mRNA by reverse transcriptase
rests in particular on the fact that the mRNA does not contain
introns, whereas these non-coding sequences are present in
the genomic DNA.

The following procedure can in particular be carried out.
Firstly the total RNA originating from a cell line such as
for example the Raji cell line is extracted (RNA Plus,
BIOPROBE), and from this RNA, synthesis of the sought cDNA is
then carried out, in particular by using a kit such as the
RNA PCR kit (Perkin Elmer).

It can be noted that within the scope of the present
invention, two oligonucleotides located at the ends of the
htfIIIA coding sequence published by ARAKAWA (Figure 5) were
synthesized, i.e. OLT5 and OLT3, and are defined as follows:

- OLT5: 5' CGGGGTACCAAAA ATG CGC AGC AGC GGC GCC GAC 3' i.e. 
SEQ ID N°5 and 

- OLT3: 5' CGGTCTAGA TTA GCC AAG GGT AAG TAC TGC 3' i.e. SEQ 
ID N°9 

but these two oligonucleotides have not made it possible to 
obtain an amplification product by PCR. 

Thus, within the scope of the present invention, the
hTFIIIA coding sequence was isolated in two stages: firstly
identification of the 3' part then identification of the 5'
part.

After identification of the 3' and 5' parts, a HindIII
restriction site located on each of these fragments then made
it possible to restore the complete sought sequence as
indicated below in the experimental part.

The following process was then carried out:

The 3' part was amplified using Pfu polymerase (STRATAGENE)
with the OLT5.2 and TFIIIA 3'SmaI oligonucleotides as primers,
i.e.:

- OLT5.2: 5' TCCTTCCCTGACTGCAGCGCC 3' or SEQ ID N°6 and

- TFIIIA 3'SmaI: 5' CCT CCC GGG GCC AAG GGT AAG TAC TGC AAC 3'
or SEQ ID N°10

The amplification primers are chosen as a function of the
part to be amplified according to the usual criteria of a
person skilled in the art.

The primers used in the present invention were chosen in the
Arakawa htfIIIA sequence shown in Figure 5.

The sequences SEQ ID N°6, SEQ ID N°7 and SEQ ID N°8 are
located in positions 320-340 (5'->3'), 361-380 (reverse and
complementary sequence) and 391-410 (reverse and
complementary sequence) respectively of this Arakawa htfIIIA
sequence.

The sequences SEQ ID N°5, SEQ ID N°9 and SEQ ID N°10 are
located in positions 20-40 (5'->3'), 1271-1291 (reverse and
complementary sequence) and 1268-1288 (reverse and
complementary sequence) respectively of this Arakawa htfIIIA
sequence.

It can be noted that sequences SEQ ID N°5, SEQ ID N°9 and SEQ
ID N°10 contain sequences corresponding to the restriction
enzyme sites KpnI, XbaI and SmaI respectively.
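
The presence of these sites in the three primers can be checked directly from the sequences given above. In the sketch below the primer sequences are copied from this description with the spaces removed, the recognition sequences are the standard ones for KpnI (GGTACC), XbaI (TCTAGA) and SmaI (CCCGGG), and the function and variable names are illustrative.

```python
PRIMERS = {
    "OLT5 (SEQ ID N°5)": "CGGGGTACCAAAAATGCGCAGCAGCGGCGCCGAC",
    "OLT3 (SEQ ID N°9)": "CGGTCTAGATTAGCCAAGGGTAAGTACTGC",
    "TFIIIA 3'SmaI (SEQ ID N°10)": "CCTCCCGGGGCCAAGGGTAAGTACTGCAAC",
}

# Standard recognition sequences of the three restriction enzymes.
SITES = {"KpnI": "GGTACC", "XbaI": "TCTAGA", "SmaI": "CCCGGG"}


def sites_found(primer):
    """Names of the enzymes whose recognition sequence occurs in the primer."""
    return [enzyme for enzyme, site in SITES.items() if site in primer]


for name, sequence in PRIMERS.items():
    print(name, sites_found(sequence))
```

Each primer carries exactly one of the three sites, placed near its 5' end ahead of the part that matches the coding sequence.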

The oligonucleotide TFIIIA 3'SmaI introduces a
SmaI restriction site downstream of the coding sequence.
This site permits, if required, the fusion
of the coding sequence for hTFIIIA with a coding sequence for
a hemagglutinin epitope peptide designated "TAG HA". The
expression of the coding sequence for TFIIIA can therefore be
combined with that of the coding sequence for TAG HA, which
can be detected by Western blot analysis if the fusion gene
is expressed.

For identification of the 5' part, this region was
isolated by the 5' anchored PCR technique (5' RACE System,
GIBCO BRL; Pfu polymerase, STRATAGENE). Two successive PCRs
were carried out using the following oligonucleotides as
primers: UAP and TFIIIAPCR5' for the first PCR and UAP and
TFIIIA SEQ2 for the second PCR.

UAP is an oligonucleotide provided in the kit.

These oligonucleotides have the following sequences:

- TFIIIAPCR5': 5' CACAAACAAATGGTCTCC 3' or SEQ ID N°8

- TFIIIA SEQ2: 5' TGCACAGGTGCGCGTCAAGC 3' or SEQ ID N°7.

The products of these PCRs, i.e. the amplified 5' and 3'
fragments, are then purified on agarose gel and cloned using
the TA cloning kit (INVITROGEN). Sequencing is then carried
out: the plasmid DNA of several independent clones is
prepared (QIAGEN Plasmid Kit) and the fragments
corresponding to the coding sequence of hTFIIIA are sequenced
on the two strands (ABI 377XL sequencer, PERKIN ELMER).

The following process can then be carried out according
to usual cloning techniques known to a person skilled in the
art, in particular cloning by insertion of each fragment
into a plasmid provided with the commercial kit (TA cloning
Kit, Invitrogen), then transformation of a bacterial strain by
the plasmid thus obtained. The XL1 Blue
E. coli strain can in particular be used.

The clones are then cultured in order to extract the
plasmid DNA according to standard techniques known to a
person skilled in the art referred to above (Sambrook, Fritsch
and Maniatis).

Sequencing of the DNA of the amplified fragment 
contained in the plasmid DNA is carried out. 

The compilation of the sequence data thus obtained
reveals that in 3', the main part of the isolated sequence
corresponds to the htfIIIA sequence of DREW et al.
In 5', the longer sequence starts in position 80 of the
htfIIIA sequence of Arakawa et al., shown in Figure 5, and
reveals the insertion of a C nucleotide in position 127 in
relation to this sequence. Although it can be supposed that the
synthesis of the cDNA in the application of the technique
described above is not complete, the insertion of a
nucleotide nevertheless creates a major problem. In fact,
the addition of a nucleotide in the coding sequence creates a
shift in the reading frame. In order to verify the presence
of this nucleotide in the htfIIIA gene, human genomic DNA was
analysed by PCR. This DNA was subjected to a PCR reaction
using Pfu polymerase (STRATAGENE) or Taq polymerase (Perkin
Elmer) with the oligonucleotides OLT5 and TFIIIA SEQ2, called
SEQ ID N°5 and SEQ ID N°7 respectively, as primers. The two
PCR products were cloned (TA cloning Kit) then sequenced.

Analysis of the sequence data confirms the presence of
this additional C nucleotide in relation to the Arakawa
htfIIIA sequence for these two amplifications. The ATG
initially described as the start codon of protein synthesis for
Arakawa htfIIIA can therefore no longer be considered as
such.

The assembly of the 5' and 3' sequences is then carried out
and a unique plasmid containing the sought hTFIIIA sequence
of the present invention is obtained. The complete hTFIIIA
coding sequence is restored in the following manner. A clone
originating from the amplification of the genomic DNA is
digested using the restriction enzymes EcoRI and HindIII, and
after purification, a fragment of approximately 350 bp is
obtained. Furthermore, a clone originating from the
amplification of the 3' part is digested using the restriction
enzymes HindIII and SmaI and, after purification, a
fragment of approximately 930 bp is obtained.

The ligation of these fragments in the plasmid pYX223
(expression vector for yeast - R&D) previously digested
by EcoRI and SmaI is then carried out.

A detailed account of the conditions under which the
operations indicated above can be carried out is given below
in the experimental part. A plasmid in which the gene of the
present invention is inserted is thus obtained, as is a host
cell into which this plasmid is introduced by operating
according to the usual techniques known to a person skilled
in the art.

The polypeptide of the present invention can be obtained
by expression in a host cell containing the DNA sequence
coding for the polypeptide of the invention preceded by a
suitable promoter sequence. The host cell can be a
procaryotic cell, for example E. coli, or a eucaryotic cell
such as a yeast, for example an ascomycete such as
Saccharomyces cerevisiae, or a mammalian cell such as a
Cos cell.

A particular subject of the present invention is an 
expression vector containing a DNA sequence as defined above. 
Thus, such an expression vector according to the present 
invention contains a DNA sequence which can be the nucleotide
sequence SEQ ID N°3 or the sequence beginning at nucleotide 
176 and terminating at nucleotide 1270 of SEQ ID N°3. 
Such an expression vector according to the present invention 
can also contain the DNA sequences which hybridize with the 
sequences defined above, and/or show a significant homology 
with these sequences or fragments of them. 

Such an expression vector according to the present 
invention can also contain DNA sequences which comprise 
modifications introduced by suppression, insertion and/or 
substitution of at least one nucleotide coding for a protein
with the same biological activity as the human transcription 
factor hTFIIIA. 

Expression vectors are vectors allowing the expression
of the protein under the control of an appropriate promoter.
Such a vector can be a plasmid, a cosmid or viral DNA. For
procaryotic cells, the promoter can be for example the
lac promoter, trp promoter, tac promoter, β-lactamase
promoter or PL promoter. For yeast cells, the promoter can
be for example the PGK promoter or GAL promoter. For mammalian
cells, the promoter can for example be the SV40 promoter or
adenovirus promoters. Baculovirus type vectors can also be
used for expression in insect cells.

The host cells are for example procaryotic cells or
eucaryotic cells. The procaryotic cells are for example E.
coli, Bacillus or Streptomyces. The eucaryotic host cells
comprise yeasts as well as cells of higher organisms, for
example mammalian or insect cells. The mammalian cells are
for example fibroblasts such as CHO or BHK hamster cells and
Cos monkey cells. The insect cells are for example SF9
cells.

The present invention therefore relates to a process 
which comprises the expression of the hTFIIIA protein in a
host cell transformed by a DNA coding for the polypeptide 
sequence corresponding to sequence SEQ ID N°2. 

For the implementation of the present invention, the
vectors used can for example be pGEX or bpAD, with E. coli as
the host cell, or the vector pYX223, with S. cerevisiae as
the host cell.

A particular subject of the present invention is a host
cell transformed with a vector as defined above, containing
the htfIIIA gene according to the present invention.

A very precise subject of the present invention is the
plasmid deposited at the CNCM under the number I-2071.
It thus concerns the XL1-Blue/bpS-tfC2LHA strain containing
the htfIIIA gene according to the present invention.
The operating conditions in which the present invention was
carried out are described below in the experimental part.
The hTFIIIA protein coded by the htfIIIA gene is therefore a
transcription regulation factor. In fact, the hTFIIIA
protein coded by the gene of the present invention has a
biological role as a DNA-binding protein and the
product of this gene is useful as a transcription regulation
factor.

In particular, the gene of the present invention is expressed
in different tissues and probably plays an important role in
the initiation of the transcription of the 5S ribosomal RNA
gene, and in maintaining the stability of the transcription
of other genes, in particular those involved in control
functions. A very large number of diseases accompanying a
transcription control disorder have recently been brought to
light. It has thus been noted that certain oncogenic products
act as transcription regulation factors and can lead to the
cancerization of cells, for example in certain leukaemias, or
that production of the regulation factor Hox2-4 in too great
a quantity induces leukaemia in mice.

Furthermore, in some hereditary diseases, the protein
concerned can in itself be normal, the pathogenicity resulting
from the transcription mechanism of the gene coding for this
protein. In particular, many hereditary diseases show an
abnormality in the quantity of proteins synthesized which is
probably due to a disorder in protein synthesis which can
in particular bring into play the htfIIIA gene and the coded
protein as factors involved in the control of the
transcription of 5S RNA.

The gene of the present invention can thus be used for
research into abnormalities in the transcription of
genes, and in particular in the identification of hereditary
diseases, for the study of diseases implicating regulation
factors and in particular the protein coded by htfIIIA.

The gene of the present invention can also be used for
the treatment of certain diseases through transcription
control or in the analysis of the pathogenesis of these
diseases.

The present invention therefore envisages the use of the
htfIIIA gene of the present invention and the hTFIIIA protein
of the present invention to contribute in particular to the
understanding of the transcription mechanism in human beings
and also to contribute to the understanding, diagnosis
and treatment of diseases linked to a disturbance in the
transcription mechanism. Thus htfIIIA and the hTFIIIA
protein could be used in the diagnosis or identification of
hereditary diseases such as certain cancers or of other
diseases resulting from abnormal transcription control.
These factors can also be useful in the analysis of the
transcription regulation mechanisms.

Therefore a subject of the present invention is the use
of the DNA sequence of the gene of the human transcription
factor htfIIIA or of the polypeptide having the function of
human transcription factor coded by the said DNA sequence as
defined above, for the preparation of compositions
useful in the diagnosis or treatment of diseases linked to a
disorder in transcription control.

Such compositions are prepared under the usual 
conditions known to a person skilled in the art. 

A more precise subject of the present invention is the
use as defined above in which the disease concerned is
cancer.

Figures 1 to 5 below show the following illustrations.

Figure 1 represents the comparison of the hTFIIIA protein of
the present invention with the DREW hTFIIIA protein.
Figure 2 represents the comparison of the hTFIIIA protein of
the present invention with the ARAKAWA hTFIIIA protein.
Figure 3 represents the comparison of the DREW hTFIIIA
protein with the ARAKAWA hTFIIIA protein.
Figure 4 represents the DREW htfIIIA sequence and the
corresponding hTFIIIA protein.
Figure 5 represents the ARAKAWA htfIIIA sequence and the
corresponding hTFIIIA protein.
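
A protein comparison of the kind shown in these figures can be summarised as a percentage of identical residues. The following Python sketch is an illustration only: the function name `percent_identity` is hypothetical, gap positions (written `.` or `-`) are assumed to be excluded from the count, and the two 50-residue rows are quoted from the alignment of the protein of the present invention with the DREW protein reproduced in this document.

```python
GAP_CHARACTERS = {".", "-"}


def percent_identity(row_a, row_b):
    """Percentage of identical residues between two aligned rows of equal length.

    Positions where either row carries a gap character are ignored.
    """
    if len(row_a) != len(row_b):
        raise ValueError("aligned rows must have the same length")
    compared = [
        (a, b)
        for a, b in zip(row_a, row_b)
        if a not in GAP_CHARACTERS and b not in GAP_CHARACTERS
    ]
    if not compared:
        raise ValueError("no comparable positions")
    identical = sum(a == b for a, b in compared)
    return 100.0 * identical / len(compared)


# One alignment row (residues 101-150 against 99-148) quoted from the figures.
present_row = "VCAATGCDQKFNTKSNLKKHFERKHENQQKQYICSFEDCKKTFKKHQQLK"
drew_row = "VCAANGCDQKFNTKSNLKKHFERKHENQQKQYICSFEDCKKTFKKHQQLK"

print(percent_identity(present_row, drew_row))
```

The two rows differ at a single position (T against N), so the sketch reports the identity over this row only and not over the complete proteins.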

The sequences indicated in the present invention, i.e. SEQ ID
N°1 to SEQ ID N°10, are described below.

The experimental part below illustrates the present
invention without however limiting it.

Experimental part

Example 1 : cloning and sequencing of the hTFIIIA gene 

I) Extraction of total RNA originating from the RAJI human
cell line (RNA Plus, BIOPROBE)

The RAJI human cell line was chosen as a source of total RNA.
The RAJI cells used were cultured under the usual culture
conditions for this line known to a person skilled in the
art.

To extract the total RNA of these cells a standard protocol
is carried out using the RNA Plus ® (BIOPROBE SYSTEMS)
commercial extraction solution, as follows:

a) Homogenization :

The cells cultured in suspension are pelleted without being
washed beforehand, in order to avoid the risk of degradation
of the mRNA, then lysed by adding the extraction solution
of the RNA Plus ® kit at a rate of 6 ml per 10^7 cells. The
samples of homogenate obtained can be stored at -70°C.

b) Extraction of the RNA :

After homogenization, the homogenate obtained in a) above is
left at 4°C for 5 minutes in order to allow the complete
dissociation of the nucleoprotein complexes, then 0.2 ml of
chloroform is added per 1 ml of the RNA Plus ® solution used
in a) above. The medium is agitated vigorously for 15 seconds
and left to rest on ice for 5 minutes, followed by
centrifuging at 12000 g and at 4°C for 15 minutes.
Two clearly visible phases then form: the DNA and the
proteins are found in the organic phase (lower phase) and at
the interface. The RNA is in the aqueous phase (upper phase)
which represents approximately 40 to 50 % of the total
volume.

c) Precipitation of the RNA : 

The aqueous phase obtained in b) is transferred into a new 
tube, a volume of isopropanol is added and the sample is 
placed at 4°C for 15 minutes, followed by centrifuging for 15 
minutes at 4°C and at 1200 g. A precipitate is obtained which 
forms a yellow-white pellet at the bottom of the tube. 

d) Washing the RNA : 

The supernatant of the solution obtained in c) is eliminated 
then the pellet is washed with a 75 % ethanol solution using 
at least 0.8 ml of ethanol per 50 to 100 micrograms of RNA. 
The medium is mixed (vortex), centrifuged for 10 minutes at 
7500 g at 4°C and dried under vacuum. The RNA obtained is
then taken up in 60 microlitres of 10 mM Tris, 1 mM EDTA,
pH 7.5.

II) Synthesis of cDNA 

a) Reagents used : 

The commercial kit GeneAmp® RNA PCR Kit (Perkin Elmer) was
used for this cDNA synthesis.

Using this kit, the reverse transcription of RNA to cDNA
is first carried out by MuLV (Murine Leukaemia Virus) reverse
transcriptase. An RNase inhibitor isolated from human
placenta is included in order to inhibit certain mammalian
RNases. The fragments of cDNA are amplified by polymerase
chain reaction (PCR). The enzyme used for this reaction is
Pfu polymerase (Stratagene).

The term dNTP designates the dGTP, dATP, dTTP and dCTP
nucleotides.

The term PCR Buffer designates the solution containing 500 mM
KCl and 100 mM Tris-HCl at pH 8.3.

The term Oligo d(T)16 designates a nucleotide sequence
constituted by 16 dTTP nucleotides.

Oligonucleotides are used as primers in the technique 
described below. 

The concentrations indicated below represent the final 
concentrations in the reaction medium. 

b) Synthesis of the cDNA by reverse transcription : 

2 microlitres of the total RNA (1 microgram) obtained above
in I) d) are pre-incubated at 65°C for 5 minutes, then 8
microlitres of the following reaction solution are added:
5 mM MgCl2, 1x PCR buffer, 1 mM of each dNTP, 5 % DMSO,
1 U/microlitre of RNase inhibitor, 2.5 U/microlitre of MuLV
reverse transcriptase, 2.5 microlitres of Oligo d(T)16.
The solution is then incubated at 42°C for one hour, then at
99°C for 5 minutes then at 5°C for 5 minutes.

III) Amplification by PCR, cloning and sequencing of the 3'
and 5' nucleotide sequences

a) Reaction conditions :

Escherichia coli (E. coli) XL1-Blue type K12 (Stratagene)
bacteria were used for the preparation of the plasmids of the
present invention.
Growth of these bacteria was carried out according to the
usual conditions in LB liquid medium which contains 10 g of
bactotryptone, 5 g of yeast extract and 10 g of NaCl per
litre of water and which also contains 100 microg/ml of
ampicillin (SIGMA).

The colony was transferred onto a solid LB + agar + ampicillin
medium then cultured in 100 ml of LB medium and incubated to
OD (600nm) = 0.8.

The incubation was carried out at 37°C under a normal
atmosphere with agitation at 225 rpm.

The viability of the strain is verified when the strain grows
on LB + ampicillin medium at 100 microg/ml, the insert
containing a gene for resistance to ampicillin, bla.
It can be noted that a gene for resistance to ampicillin, bla,
is part of the vector of the kit (TA cloning Kit -
Invitrogen) in which the fragments of htfIIIA are cloned.
Thus, selection of strains containing the plasmids containing
the htfIIIA gene of the present invention can be carried out
by culture of the strains in this medium which contains
ampicillin (100 microg/ml), such a medium allowing the
survival only of strains which contain the gene for
resistance to ampicillin and therefore only strains which
contain the htfIIIA gene of the present invention.
For the preservation of the strains obtained, 15 % glycerol
is added to the culture medium: the cultures are therefore
preserved in the LB + 100 micrograms/ml of ampicillin + 15 %
glycerol suspension medium, at the bacterial concentration
of OD (600nm) = 0.8, in the form of aliquots in cryotubes of
1 ml per tube.

For the sequencing, the plasmid DNA of several bacteria
originating from each of the cloning procedures indicated
below is prepared using a commercial kit (Qiagen Plasmids
kit). The fragments corresponding to the htfIIIA coding
sequence are sequenced on the two strands according to
standard techniques known to a person skilled in the art (use
of the ABI 377 XL sequencer, Perkin Elmer).

b) Amplification by PCR, and cloning of the 3' and 5'
nucleotide sequences :

1) Amplification and cloning of the 3' nucleotide sequence
Two amplification primers were chosen according to
the published ARAKAWA htfIIIA sequence. These primers, OLT3 or
TFIIIA3'SmaI and OLT5.2, are called SEQ ID N°10 and
SEQ ID N°6 respectively.

These oligonucleotides are chosen from the hTFIIIA sequence
published by ARAKAWA (Figure 5) and are synthesized according
to standard methods known to a person skilled in the art.
The TFIIIA3'SmaI oligonucleotide introduces a SmaI restriction
site downstream of the coding sequence. This site will
allow the fusion of the htfIIIA 3' nucleotide sequence with a
coding sequence for the haemagglutinin (HA) TAG peptide.
Thus, the peptide resulting from the expression of the cloned
sequence will consist of both the htfIIIA sequence
of the present invention and that of the HA TAG and can
therefore be detected by Western analysis according to usual
techniques known to a person skilled in the art.

The following process is then carried out: 2 microlitres of
cDNA obtained above in II) b) are added to 50 microlitres of
the following reaction solution: 2 mM MgCl2, 1x PCR buffer, 200
nanograms/ml of each dNTP, the TFIIIA3'SmaI and OLT5.2
primers at a rate of 0.15 micromoles/l for each, 5 % DMSO and
2.5 U AmpliTaq DNA polymerase.
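
Since the concentrations given are final concentrations in the reaction medium, the volume of each stock solution follows from the dilution relation C1 x V1 = C2 x V2. The Python sketch below is a worked example under assumptions: the stock concentrations (25 mM MgCl2, 10x PCR buffer, 10 micromoles/l primer) are hypothetical and are not given in this document, and the function name `stock_volume` is illustrative.

```python
def stock_volume(final_concentration, stock_concentration, total_volume):
    """Volume of stock needed so that it reaches final_concentration in total_volume.

    Both concentrations must share one unit; the result is in the unit of
    total_volume (here microlitres).
    """
    if stock_concentration <= 0:
        raise ValueError("stock concentration must be positive")
    if final_concentration > stock_concentration:
        raise ValueError("stock is more dilute than the requested final concentration")
    return final_concentration * total_volume / stock_concentration


REACTION_VOLUME_UL = 50.0  # reaction volume stated in the text

# Hypothetical stock concentrations, chosen only for illustration.
mgcl2_ul = stock_volume(2.0, 25.0, REACTION_VOLUME_UL)    # 2 mM final from 25 mM stock
buffer_ul = stock_volume(1.0, 10.0, REACTION_VOLUME_UL)   # 1x final from 10x stock
primer_ul = stock_volume(0.15, 10.0, REACTION_VOLUME_UL)  # 0.15 uM final from 10 uM stock

print(mgcl2_ul, buffer_ul, primer_ul)
```

With other stock concentrations only the second argument changes; the final concentrations are those stated in the reaction solution above.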

The cDNA is thus subjected to 30 PCR cycles, firstly at 94°C
for one minute then at 65°C for 1 minute then at 72°C for 1
minute.

The products amplified by PCR thus obtained are 3'
fragments of approximately 970 base pairs.

The 3' fragments obtained above are cloned in the pCRII
vector using the TA cloning Kit (Invitrogen).

The plasmid thus obtained is called 5.2 Raji 2.9.

This plasmid is transferred into the XL1 Blue
E. coli strain.

The E. coli strain transformed by the plasmid 5.2 Raji 2.9 is
thus obtained.

2) Amplification and cloning of the 5' nucleotide sequence
The 5' portion of the htfIIIA gene of the present invention
was isolated by the technique known as 5' anchored PCR using a
commercial kit (5' RACE System, Rapid Amplification of cDNA
Ends, GIBCO BRL).

Two amplification primers were chosen from the
published ARAKAWA htfIIIA sequence (cf. Figure 5).
These TFIIIAPCR5' and TFIIIA SEQ2 primers are called SEQ ID
N°8 and SEQ ID N°7 respectively.

A homopolymeric tail is added to the 3' end of the cDNA
using dATP and terminal deoxynucleotidyl transferase (TdT):
10 microlitres of cDNA obtained above in II) b) are incubated
at 37°C for 10 minutes in the 1x tailing buffer reaction
solution (commercial kit solution) with 0.2 mM of dATP and
TdT. The TdT is deactivated for 10 minutes at 65°C and the
reaction is then brought to 4°C.

The reaction is then directly amplified by PCR: 10
microlitres of the TdT reaction are added to 50 microlitres
of PCR reaction solution i.e. 1.5 mM of MgCl2, 1x PCR buffer,
200 nanomoles/ml of each dNTP, the UAP and TFIIIAPCR5' primers
at a rate of 0.2 micromoles/l for each, 5 % DMSO and 2.5 U
AmpliTaq DNA polymerase.

The UAP primer is an oligonucleotide provided with the
commercial kit.

The cDNA is thus subjected to 30 PCR cycles, firstly at 94°C
for one minute, then at 65°C for 1 minute then at 72°C for 1
minute.

The products amplified by this first PCR, i.e. PCR1, are
subjected to a second amplification reaction by PCR using the
UAP primer and the specific TFIIIA SEQ2 primer. The following
process is carried out: 5 microlitres of PCR1 are added to 50
microlitres of the following PCR reaction solution: 1.5
mM of MgCl2, 1x PCR buffer, 200 micromoles/l of each dNTP, the
UAP and TFIIIA SEQ2 primers at a rate of 0.2 micromoles/l for
each, 5 % DMSO and 2.5 U AmpliTaq DNA polymerase.
The DNA is then subjected to 30 PCR cycles, firstly at 94°C
for one minute, then at 65°C for 1 minute then at 72°C for 1
minute.

The products amplified by this second PCR, i.e. PCR2, are
purified on agarose gel. The 5' fragments of approximately
380 base pairs are thus isolated.

The 5' fragments obtained above are then cloned in the pCRII
vector using the TA cloning Kit (Invitrogen).

The plasmid thus obtained is called cDNA-DMSO-3.

This plasmid is transferred into the XL1 Blue E. coli strain.

The E. coli strain transformed by the plasmid cDNA-DMSO-3 is
thus obtained.

3) Verification of the 5' sequence by amplification of the 
genomic DNA - Construction of the 5 geno-3 plasmid 
Human genomic DNA is extracted from human liver cells 
according to the usual methods known to a person skilled in 
the art. 

Amplification by PCR of the human genomic DNA is carried out 
in the following manner: 

2 micrograms of human genomic DNA obtained as indicated above
are added to 100 microlitres of the following PCR reaction
solution: 2 mM MgCl2, 1x native Pfu DNA polymerase buffer,
200 nanograms/ml of each dNTP, the OLT5 and TFIIIA SEQ2
primers at a rate of 0.15 micromoles/l for each, 5 % DMSO and
5 U Pfu polymerase.

OLT5 and TFIIIA SEQ2 are called SEQ ID N°5 and SEQ ID N°7
respectively.

The reaction medium is thus subjected to 30 PCR cycles, 
firstly at 94°C for one minute, then at 60°C for 1 minute, 
then at 72°C for 1 minute. 

The products amplified by PCR thus obtained are fragments of 
DNA of approximately 360 base pairs. 

The fragments thus obtained are cloned in the pCR-Script
vector using the pCR-Script SK(+) Cloning kit (Stratagene).
The plasmid thus obtained is called 5 geno-3.

This plasmid is transferred into the XL1 Blue E. coli strain. 
The E. coli strain transformed by the plasmid 5 geno-3 is 
thus obtained. 

4) Cloning of the htfIIIA gene according to the present
invention.

Construction of the pYX TFIIIALHA plasmid

The complete htfIIIA coding sequence is restored by assembly
of the two 3' and 5' fragments obtained above in III) b) 1)
and III) b) 3).
A HindIII restriction site located on each of the 3' and 5'
fragments obtained above makes it possible to restore the
complete sequence.
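
Locating such a shared restriction site in a sequence can be sketched in Python as follows. This is a minimal illustration, not the procedure used here: the function name `find_sites` is hypothetical, only the well-known recognition sequences of EcoRI (GAATTC), HindIII (AAGCTT) and SmaI (CCCGGG) are listed, and the 30-nucleotide example is quoted from the DREW sequence figure of this document (nucleotides 151 to 180 of that figure).

```python
# Recognition sequences of the three enzymes named in the text.
RECOGNITION_SITES = {
    "EcoRI": "GAATTC",
    "HindIII": "AAGCTT",
    "SmaI": "CCCGGG",
}


def find_sites(sequence, enzyme):
    """Return the 1-based start positions of an enzyme's recognition site."""
    site = RECOGNITION_SITES[enzyme]
    sequence = sequence.upper()
    positions = []
    start = sequence.find(site)
    while start != -1:
        positions.append(start + 1)
        start = sequence.find(site, start + 1)
    return positions


# Stretch quoted from the sequence figure (figure numbering 151-180).
example = "AAAGCCTGGAAGCTTGACGCGCACCTGTGC"

hindiii_positions = find_sites(example, "HindIII")
print(hindiii_positions)
```

Only the top strand is searched, which is sufficient here because the three recognition sequences are palindromic.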

The 5 geno-3 plasmid obtained above in III) b) 3) is digested
by the EcoRI and HindIII restriction enzymes.
The EcoRI site is located 11 nucleotides upstream of the
coding sequence.

Fragments of approximately 350 base pairs are obtained after
purification on agarose gel.

Ligation with the vector pYX/EcoRI + HindIII is then carried
out and the vector pYXTFIIIA5' is obtained.

The addition of the 3' fragment to the 5' fragment is then
carried out: the 5.2 Raji 2.9 plasmid obtained above in III)
b) 1) is digested by the restriction enzymes HindIII and
SmaI.

After purification on agarose gel, a fragment of
approximately 930 base pairs is obtained. This fragment is
inserted into the pYXTFIIIA5' plasmid obtained above,
previously digested by the restriction enzymes SmaI and
HindIII.

The pYXTFIIIALHA plasmid is thus obtained which therefore
contains the hTFIIIA gene of the present invention.
Example 2 : Construction of the XL1-Blue/pYX TFIIIALHA strain
The preparation of the XL1-Blue/pYX TFIIIALHA strain is
carried out according to techniques known to a person skilled
in the art (ref. above: Sambrook, Fritsch and Maniatis) from
the XL1-Blue type K12 E. coli strain (Stratagene), into which
the pYX TFIIIALHA plasmid obtained above in Example 1 is
introduced.

Example 3 : Construction of the bpS-tfC2LHA plasmid
The vector bpS-SK+ (Stratagene) is used, in which an insert
coding for the hTFIIIA gene of the present invention is
integrated. The following process is carried out: the
pYXTFIIIALHA plasmid obtained above in Example 1 is digested
by the restriction enzyme EcoRI, and this end is filled in
using DNA Polymerase (Klenow fragment) in the presence of
dNTP. This plasmid is then digested by NheI and the fragment
corresponding to the htfIIIA sequence according to the
present invention is purified. This fragment is inserted
into the bpS-SK+ vector prepared as follows: the vector is
digested by EcoRI, this site is filled in using DNA
polymerase, then the vector is digested by XbaI.
The plasmid bpS-tfC2LHA is thus obtained.

Example 4 : Construction of the XL1-Blue/bpS-tfC2LHA strain
For the preparation of the XL1-Blue/bpS-tfC2LHA strain,
techniques known to a person skilled in the art are carried
out using the XL1-Blue type K12 E. coli strain (Stratagene),
into which the bpS-tfC2LHA plasmid obtained above in Example 3
is introduced.

A sample of the strain obtained, i.e. XL1-Blue type K12 E.
coli (Stratagene) containing the bpS-SK+ vector (Stratagene)
with an insert coding for tfC2 (cDNA coding part containing
the htfIIIA coding region), i.e. XL1-Blue/bpS-tfC2LHA, was
deposited at the Institut Pasteur, 25 rue du Docteur
Roux, Paris 75015, at the CNCM on 15th September 1998 under
the number I-2071.

Example 5 : Identification of the start codon of protein
synthesis.

Purification of the hTFIIIA protein was described by
Moorefield et al (1994) [reference: the Journal of Biological
Chemistry, Vol. 269, N° 33, pp. 20857-20865, 1994,
Purification and Characterization of Human Transcription
Factor IIIA, B. Moorefield and R. G. Roeder].

The hTFIIIA protein identified by Moorefield has a molecular
weight of 42 kDa. It can be noted that the theoretical
molecular weight of the hTFIIIA protein coded by the Arakawa
htfIIIA sequence is 47 kDa.
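
The order of magnitude of these two values can be checked with a rough estimate. The Python sketch below rests on two assumptions that are not statements of this document: an average residue mass of about 110 Da, which is a common rule of thumb rather than an exact calculation, and chain lengths of 365 and 423 residues, read from the residue numbering of the comparison of the protein of the present invention with the ARAKAWA protein. The function name `approximate_mass_kda` is hypothetical.

```python
AVERAGE_RESIDUE_MASS_DA = 110.0  # rule-of-thumb average, an assumption


def approximate_mass_kda(residue_count, residue_mass_da=AVERAGE_RESIDUE_MASS_DA):
    """Rough molecular mass in kDa from the number of residues."""
    if residue_count < 0:
        raise ValueError("residue count cannot be negative")
    return residue_count * residue_mass_da / 1000.0


# Lengths read from the residue numbering of the comparison figure.
present_invention_kda = approximate_mass_kda(365)
arakawa_kda = approximate_mass_kda(423)

print(present_invention_kda, arakawa_kda)
```

Under these assumptions the shorter chain comes out near 40 kDa and the longer one near 47 kDa, which is the kind of gap that separates the measured 42 kDa from the theoretical 47 kDa; an exact figure would require summing the masses of the individual residues.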

Protein synthesis is generally started at an ATG codon.
However the htfIIIA coding sequence of the present invention
does not contain an ATG codon in phase.

It has been demonstrated that codons different from ATG, in
particular the CTG or GTG codons, are start codons of
translation in natural cellular transcripts.

With techniques known to a person skilled in the art such as
translation experiments in vitro with the htfIIIA sequence
according to the invention obtained above in Example 1, and
by expression tests in mammalian cells such as Cos cells, the
start codon of hTFIIIA protein synthesis according to the
present invention was demonstrated.

Within the scope of the present invention, it has thus been
demonstrated that the start codon of hTFIIIA protein
synthesis according to the present invention is the CTG codon
which is found in position 176-178 of SEQ ID N°3.
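
The coordinates quoted for SEQ ID N°3 (start codon at 176-178, sequence finishing at nucleotide 1270) can be checked for consistency with a reading frame. The Python sketch below is simple arithmetic; the function name `frame_summary` is hypothetical, and whether the interval includes a stop codon is not stated in the text and is not assumed here.

```python
def frame_summary(first_nucleotide, last_nucleotide):
    """Return (length in nucleotides, whole codons, leftover nucleotides)
    for an interval given with 1-based inclusive coordinates."""
    if last_nucleotide < first_nucleotide:
        raise ValueError("last nucleotide must not precede the first")
    length = last_nucleotide - first_nucleotide + 1
    codons, leftover = divmod(length, 3)
    return length, codons, leftover


length, codons, leftover = frame_summary(176, 1270)
print(length, codons, leftover)
```

The interval is a whole number of codons, and the codon count equals the last residue number (365) shown for the protein of the present invention in the comparison figures.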
Analysis of the results

Analysis of the results obtained by the preparations of the
examples indicated above reveals the following points relating
to the htfIIIA coding sequence:

- in 3' (above in III) b) 1)) the main part of the sequence
isolated in the present Application corresponds to the DREW
htfIIIA sequence

- in 5' (above in III) b) 3)) the longest sequence of
fragments obtained by the preparation described above in III)
b) 3) begins in position 20 of the ARAKAWA htfIIIA sequence
and reveals the insertion of a nucleotide in position 127 of
the ARAKAWA htfIIIA sequence.

The results obtained by the preparations of htfIIIA described
above according to the present invention confirm that the
nucleotide omitted in position 127 of the ARAKAWA
sequence really does exist in the human htfIIIA gene.

CLAIMS

1) DNA sequence of the htfIIIA gene coding for a protein
having the biological function of human transcription factor
hTFIIIA.

2) DNA sequence of the htfIIIA gene of the human
transcription factor hTFIIIA according to claim 1, coding for
the amino acid sequence SEQ ID N°2.

3) DNA sequence of the htfIIIA gene according to claim 1 or 2
containing the nucleotide sequence SEQ ID N°3.

4) DNA sequence of the htfIIIA gene according to claims 1 to
3 containing the nucleotide sequence SEQ ID N°4.

5) DNA sequence according to claim 4 having the sequence
beginning at nucleotide 176 and finishing at the nucleotide
1270 of SEQ ID N°3.

6) DNA sequence coding for the human transcription factor
hTFIIIA according to claims 1 to 5 as well as the DNA
sequences which hybridize with it and/or show a significant
homology with this sequence or fragments of it and which code
for a protein with the same function.

7) DNA sequence according to claims 1 to 6 comprising
modifications introduced by suppression, insertion and/or
substitution of at least one nucleotide coding for a protein
with the same biological activity as human transcription
factor hTFIIIA.

8) DNA sequence according to one of claims 1 to 7 as well as
similar DNA sequences which have nucleotide sequence homology
of at least 50 % or at least 60 % and preferably at least 70
% with the said DNA sequence.

9) DNA sequence according to one of claims 1 to 8 as well as
similar DNA sequences which code for a protein, the AA
sequence of which has a homology of at least 40 % and in
particular 45 % or at least 50 %, rather at least 60 % and
preferably at least 70 % with the AA sequence coded by the
said DNA sequence.

10) Polypeptide having the function of human transcription
factor hTFIIIA and with the amino acid sequence SEQ ID N°2
coded by the DNA sequence according to one of claims 1 to 9
and the analogues of this polypeptide.

11) Process for the preparation of the hTFIIIA recombinant
protein having the amino acid sequence SEQ ID N°2 comprising
the expression of the DNA sequence according to one of claims
1 to 9 in an appropriate host, then isolation and purification
of the said recombinant protein.

12) Expression vector containing the DNA sequence according
to one of claims 3 to 9.

13) Host cell transformed with a vector according to claim 12.

14) Plasmid deposited at the CNCM under the number I-2071.

15) Use of the human transcription factor htfIIIA gene or of
the human transcription factor coded by this gene according
to one of the claims 1 to 10 for the preparation of
compositions which can be used for the diagnosis or treatment
of diseases linked to a disorder in transcription control.

16) Use according to claim 15 for which the disease concerned
is cancer.
  1 MDPPAVVAESVSSLTIADAFIAAGESSAPTPPRPALPRRFICSFPDCSAN 50
      ||||||||||||||||||||||||||||||||||||||||||||||||
  1 ..PPAVVAESVSSLTIADAFIAAGESSAPTPPRPALPRRFICSFPDCSAN 48

 51 YSKAWKLDAHLCKHTGERPFVCDYEGCGKAFIRDYHLSRHILTHTGEKPF 100
    ||||||||||||||||||||||||||||||||||||||||||||||||||
 49 YSKAWKLDAHLCKHTGERPFVCDYEGCGKAFIRDYHLSRHILTHTGEKPF 98

101 VCAATGCDQKFNTKSNLKKHFERKHENQQKQYICSFEDCKKTFKKHQQLK 150
    |||| |||||||||||||||||||||||||||||||||||||||||||||
 99 VCAANGCDQKFNTKSNLKKHFERKHENQQKQYICSFEDCKKTFKKHQQLK 148

151 IHQCQHTNEPLFKCTQEGCGKHFASPSKLKRHAKAHEGYVCQKGCSFVAK 200
    ||||||||||||||||||||||||||||||||||||||||||||||||||
149 IHQCQHTNEPLFKCTQEGCGKHFASPSKLKRHAKAHEGYVCQKGCSFVAK 198

201 TWTELLKHVRETHKEEILCEVCRKTFKRKDYLKQHMKTHAPERDVCRCPR 250
    ||||||||||||||||||||||||||||||||||||||||||||||||||
199 TWTELLKHVRETHKEEILCEVCRKTFKRKDYLKQHMKTHAPERDVCRCPR 248

251 EGCGRTYTTVFNLQSHILSFHEESRPFVCEHAGCGKTFAMKQSLTRHAVV 300
    ||||||||||||||||||||||||||||||||||||||||||||||||||
249 EGCGRTYTTVFNLQSHILSFHEESRPFVCEHAGCGKTFAMKQSLTRHAVV 298

301 HDPDKKKMKLKVKKSREKRSLASHLSGYIPPKRKQGQGLSLCQNGESPNC 350
    ||||||||||||||||||||||||||||||||||||||||||||||||||
299 HDPDKKKMKLKVKKSREKRSLASHLSGYIPPKRKQGQGLSLCQNGESPNC 348

351 VEDKMLSTVAVLTLG 365
    |||||||||||||||
349 VEDKMLSTVAVLTLG 363
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i MDPPAVVAESVSSLTIADAFIAAGESSAPTPPRPALPRRFIC 4 2 

: I I M I I I I I I I M ! I I I I i I I M I I I I I I I | | | | | | | | [| | 
51 PGLGGAGALDPPAVVAESVSSLTIADAFIAAGESSAPTPPRPALPRRFIC 100 

4 3 SFPDCSANYSKAWKLDAHLCKHTGERPFVCDYEGCGKAFIRDYHLSRHIL 92 
M I I I I M I I I I I I ! I I I I I I I I I I I I I I I I! I I I I I I I I I I I I I I I I 1 I 
101 SFPDCSANYSKAWKLDAHLCKHTGERPFVCDYEGCGKAFIRDYHLSRHIL 150 

93 THTGEKPFVCAATGCDQKFNTKSNLKKHFERKHENQQKQYICSFEDCKKT 142 
I I I I I i I I I I I ! N I M I I I I II I I I II II I I I I I I M I I I I II I I I I | 
151 THTGEKPFVCAANGCDQKFNTKSNLKKHFERKHENQQKQYICSFEDCKKT 200 

143 FKKHQQLKIHQCQHTNEPLFKCTQEGCGKHFASPSKLKRHAKAHEGYVCQ 192 

N I I I I 1 I I I M I . I I II I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I 
2 01 FKKHQQLKIHQCQNTNEPLFKCTQEGCGKHFASPSKLKRHAKAHEGYVCQ 25 0 

193 KGCSFVAKTWTELLKHVRETHKEEILCEVCRKTFKRKDYLKQHMKTHAPE 24 2 

I ' I I I M I M I I II II I I I I I I ! I I M M I I I M I I I I I I I I I I I I I I I I 
251 KGCSFVAKTWTELLKHVRETHKEEILCEVCRKTFKRKDYLKQHMKTHAPE 300 

24 3 RDVCRCPREGCGRTYTTVFNLQSHILSFHEESRPFVCEHAGCGKTFAMKQ 2 92 

I I I ! I I I M ! I I I I I I I I I I I I | | || | M I I I I II I I I I II I I || I | || | 
301 RDVCRCPREGCGRTYTTVFNLQSHILSFHEESRPFVCEHAGCGKTFAMKQ 350 

2 93 SLTRHAWHDPDKKKMKLKVKKSREKRSLASELSGYIPPKRKQGQGLSLC 34 2 

M I I I I I I I I I I I I I I M I II I I II I I I I I I I I I I I I I | | 

351 SLTRHAVVHDPDKKKMKLKVKKSREKREFGLSSQWIYPPKRKQGQGLSLC 400 

34 3 QNGESPNCVEDKMLSTVAVLTLG 365 

I I I I M ! I I I I I I I I I I II I II I 
4 01 QNGESPNCVEDKMLSTVAVLTLG 423 
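
The alignment above can be condensed into a single figure: the percentage of aligned positions at which the two proteins carry the same residue. The following is a minimal Python sketch of that calculation and is not part of the filing. The function name `percent_identity` is hypothetical; the upper row is taken as residues 293 to 342 of SEQ ID No: 2, and the lower row as the block numbered 351 to 400 in the longer sequence.

```python
def percent_identity(row_a: str, row_b: str) -> float:
    """Percentage of aligned columns holding the same residue.

    Both rows must already be aligned (equal length); a gap
    character '-' is never counted as a match.
    """
    if len(row_a) != len(row_b):
        raise ValueError("aligned rows must have equal length")
    matches = sum(1 for a, b in zip(row_a, row_b) if a == b and a != "-")
    return 100.0 * matches / len(row_a)


# Assumed input: the 50-residue block 293..342 / 351..400 of the alignment.
row_a = "SLTRHAVVHDPDKKKMKLKVKKSREKRSLASHLSGYIPPKRKQGQGLSLC"
row_b = "SLTRHAVVHDPDKKKMKLKVKKSREKREFGLSSQWIYPPKRKQGQGLSLC"

print(percent_identity(row_a, row_b))  # 80.0
```

Under these assumptions the two rows agree at 40 of 50 positions, and the ten mismatches form one contiguous run.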
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RDVCRCPREGCGRTYTTVFNLQSHILSFHEESRPFVCEHAGCGKTFAMKQ 290 


351 
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400 
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1 CCGCCGGCCGTGGTCGCCGAGTCGGTGTCGTCCTTGACCATCGCCGACGC 50 
1 PPAVVAESVSSLTIADA 17

51 GTTCATTGCAGCCGGCGAGAGCTCAGCTCCGACCCCGCCGCGCCCCGCGC 100
18 FIAAGESSAPTPPRPAL 34

101 TTCCCAGGAGGTTCATCTGCTCCTTCCCTGACTGCAGCGCCAATTACAGC 150
35 PRRFICSFPDCSANYS 50

151 AAAGCCTGGAAGCTTGACGCGCACCTGTGCAAGCACACGGGGGAGAGACC 200
51 KAWKLDAHLCKHTGERP 67

201 ATTTGTTTGTGACTATGAAGGGTGTGGCAAGGCCTTCATCAGGGACTACC 250
68 FVCDYEGCGKAFIRDYH 84

251 ATCTGAGCCGCCACATTCTGACTCACACAGGAGAAAAGCCGTTTGTTTGT 300
85 LSRHILTHTGEKPFVC 100

301 GCAGCCAATGGCTGTGATCAAAAATTCAACACAAAATCAAACTTGAAGAA 350
101 AANGCDQKFNTKSNLKK 117

351 ACATTTTGAACGCAAACATGAAAATCAACAAAAACAATATATATGCAGTT 400
118 HFERKHENQQKQYICSF 134

401 TTGAAGACTGTAAGAAGACCTTTAAGAAACATCAGCAGCTGAAAATCCAT 450
135 EDCKKTFKKHQQLKIH 150

451 CAGTGCCAGCATACCAATGAACCTCTATTCAAGTGTACCCAGGAAGGATG 500
151 QCQHTNEPLFKCTQEGC 167

501 TGGGAAACACTTTGCATCACCCAGCAAGCTGAAACGACATGCCAAGGCCC 550
168 GKHFASPSKLKRHAKAH 184

551 ACGAGGGCTATGTATGTCAAAAAGGATGTTCCTTTGTGGCAAAAACATGG 600
185 EGYVCQKGCSFVAKTW 200

601 ACGGAACTTCTGAAACATGTGAGAGAAACCCATAAAGAGGAAATACTATG 650
201 TELLKHVRETHKEEILC 217

651 TGAAGTATGCCGGAAAACATTTAAACGCAAAGATTACCTTAAGCAACACA 700
218 EVCRKTFKRKDYLKQHM 234

701 TGAAAACTCATGCCCCAGAAAGGGATGTATGTCGCTGTCCAAGAGAAGGC 750
235 KTHAPERDVCRCPREG 250

751 TGTGGAAGAACCTATACAACTGTGTTTAATCTCCAAAGCCATATCCTCTC 800
251 CGRTYTTVFNLQSHILS 267

801 CTTCCATGAGGAAAGCCGCCCTTTTGTGTGTGAACATGCTGGCTGTGGCA 850
268 FHEESRPFVCEHAGCGK 284

851 AAACATTTGCAATGAAACAAAGTCTCACTAGGCATGCTGTTGTACATGAT 900
285 TFAMKQSLTRHAVVHD 300

901 CCTGACAAGAAGAAAATGAAGCTCAAAGTCAAAAAATCTCGTGAAAAACG 950
301 PDKKKMKLKVKKSREKR 317

951 GAGTTTGGCCTCTCATCTCAGTGGATATATCCCTCCCAAAAGGAAACAAG 1000
318 SLASHLSGYIPPKRKQG 334

1001 GGCAAGGCTTATCTTTGTGTCAAAACGGAGAGTCACCCAACTGTGTGGAA 1050
335 QGLSLCQNGESPNCVE 350

1051 GACAAGATGCTCTCGACAGTTGCAGTACTTACCCTTGGCTAAGAACTGCA 1100
351 DKMLSTVAVLTLG* 364

1101 CTGCTTTGTTTAAAGGACTGCAGACCAAGGAGCGAGCTTTCTCTCAGAGC 1150
1151 ATGCTTTTCTTTATTAAAATTAC 1173
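
The figure above pairs each nucleotide row with the amino acids it encodes. As a hedged illustration of how such a pairing can be reproduced, the Python sketch below translates a DNA string with the standard genetic code. It is not taken from the filing: the names `CODON_TABLE` and `translate` are hypothetical, and the example input is assumed to be the first nucleotide row of the figure (positions 1 to 50) read in frame from position 1.

```python
from itertools import product

# Standard genetic code, written in the conventional T, C, A, G codon order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str, start: int = 1) -> str:
    """Translate complete codons of `dna`, reading from 1-based position `start`.

    A trailing partial codon is ignored; '*' marks a stop codon.
    """
    dna = dna.upper()
    first = start - 1
    usable_end = len(dna) - (len(dna) - first) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(first, usable_end, 3))


# Assumed input: nucleotides 1..50 of the figure above.
row = "CCGCCGGCCGTGGTCGCCGAGTCGGTGTCGTCCTTGACCATCGCCGACGC"
print(translate(row))  # PPAVVAESVSSLTIAD
```

The 50-nucleotide row holds 16 complete codons; the seventeenth residue shown under it in the figure is completed by the first base of the next row.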
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1 ATGCGCGATCTCCCGGAGCATGCGCAGCAGCGGCGCCGACGCGGGGCGGT 50 
1 MRSSGADAGRC 11

51 GCCTGGTGACCGCGCGCGCTCCCGGAAGTGTGCCGGCGTCGCGCGAAGGT 100
12 LVTARAPGSVPASREG 27

101 TCAGCAGGGAGCCGTGGGCCGGGCGCGCGGTTCCCGGCACGTGTCTCGGC 150
28 SAGSRGPGARFPARVSA 44

151 ACGTGGCAGCGCGCCTGGCCCTGGGCTTGGAGGCGCCGGCGCCCTGGATC 200
45 RGSAPGPGLGGAGALDP 61

201 CGCCGGCCGTGGTCGCCGAGTCGGTGTCGTCCTTGACCATCGCCGACGCG 250
62 PAVVAESVSSLTIADA 77

251 TTCATTGCAGCCGGCGAGAGCTCAGCTCCGACCCCGCCGCGCCCCGCGCT 300
78 FIAAGESSAPTPPRPAL 94

301 TCCCAGGAGGTTCATCTGCTCCTTCCCTGACTGCAGCGCCAATTACAGCA 350
95 PRRFICSFPDCSANYSK 111

351 AAGCCTGGAAGCTTGACGCGCACCTGTGCAAGCACACGGGGGAGAGACCA 400
112 AWKLDAHLCKHTGERP 127

401 TTTGTTTGTGACTATGAAGGGTGTGGCAAGGCCTTCATCAGGGACTACCA 450
128 FVCDYEGCGKAFIRDYH 144

451 TCTGAGCCGCCACATTCTGACTCACACAGGAGAAAAGCCGTTTGTTTGTG 500
145 LSRHILTHTGEKPFVCA 161

501 CAGCCAATGGCTGTGATCAAAAATTCAACACAAAATCAAACTTGAAGAAA 550
162 ANGCDQKFNTKSNLKK 177

551 CATTTTGAACGCAAACATGAAAATCAACAAAAACAATATATATGCAGTTT 600
178 HFERKHENQQKQYICSF 194

601 TGAAGACTGTAAGAAGACCTTTAAGAAACATCAGCAGCTGAAAATCCATC 650
195 EDCKKTFKKHQQLKIHQ 211

651 AGTGCCAGAATACCAATGAACCTCTATTCAAGTGTACCCAGGAAGGATGT 700
212 CQNTNEPLFKCTQEGC 227

701 GGGAAACACTTTGCATCACCCAGCAAGCTGAAACGACATGCCAAGGCCCA 750
228 GKHFASPSKLKRHAKAH 244

751 CGAGGGCTATGTATGTCAAAAAGGATGTTCCTTTGTGGCAAAAACATGGA 800
245 EGYVCQKGCSFVAKTWT 261

801 CGGAACTTCTGAAACATGTGAGAGAAACCCATAAAGAGGAAATACTATGT 850
262 ELLKHVRETHKEEILC 277

851 GAAGTATGCCGGAAAACATTTAAACGCAAAGATTACCTTAAGCAACACAT 900
278 EVCRKTFKRKDYLKQHM 294

901 GAAAACTCATGCCCCAGAAAGGGATGTATGTCGCTGTCCAAGAGAAGGCT 950
295 KTHAPERDVCRCPREGC 311

951 GTGGAAGAACCTATACAACTGTGTTTAATCTCCAAAGCCATATCCTCTCC 1000
312 GRTYTTVFNLQSHILS 327

1001 TTCCATGAGGAAAGCCGCCCTTTTGTGTGTGAACATGCTGGCTGTGGCAA 1050
328 FHEESRPFVCEHAGCGK 344

1051 AACATTTGCAATGAAACAAAGTCTCACTAGGCATGCTGTTGTACATGATC 1100
345 TFAMKQSLTRHAVVHDP 361

1101 CTGACAAGAAGAAAATGAAGCTCAAAGTCAAAAAATCTCGTGAAAAACGG 1150
362 DKKKMKLKVKKSREKR 377

1151 GAGTTTGGCCTCTCATCTCAGTGGATATATCCTCCCAAAAGGAAACAAGG 1200
378 EFGLSSQWIYPPKRKQG 394

1201 GCAAGGCTTATCTTTGTGTCAAAACGGAGAGTCACCCAACTGTGTGGAAG 1250
395 QGLSLCQNGESPNCVED 411

1251 ACAAGATGCTCTCGACAGTTGCAGTACTTACCCTTGGCTAAGAACTGCAC 1300
412 KMLSTVAVLTLG* 424

1301 TGCTTTGTTTAAAGGACTGCAGACCAAGGAGTCGAGCTTTCTCTCAGAGC 1350

1351 ATGCTTTTCTTTATTAAAATTACTGATGCAGAAAAAAAAAAAAAAAAAA 1399
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sequence listing JC08 Rec'd PCT/PTO 0 8 MAY 200T 

<110> Hoechst Marion Roussel 

<120> HUMAN hTFIIIA GENE AND CODED hTFIIIA PROTEIN

<130> 9823seq 

<140> 
<141> 

<160> 10 

<170> PatentIn Vers. 2.0

<210> 1 
<211> 1273 
<212> ADN 
<213> Human 

<220> 
<221> CDS 
<222> (176)..(1270)

<400> SEQ ID No: 1
atgcgcagca gcggcgccga cgcggggcgg tgcctggtga ccgcgcgcgc tcccggaagt 60

gtgccggcgt cgcgcgaagg ttcagcaggg agccgtgggc cgggcgcgcc ggttcccggc 120

acgtgtctcg gcacgtggca gcgcgcctgg ccctgggctt ggaggcgccg gcgcc ctg 178
Leu
1

gat ccg ccg gcc gtg gtc gcc gag tcg gtg tcg tcc ttg acc atc gcc 226
Asp Pro Pro Ala Val Val Ala Glu Ser Val Ser Ser Leu Thr Ile Ala
5 10 15

gac gcg ttc att gca gcc ggc gag agc tca gct ccg acc ccg ccg cgc 274
Asp Ala Phe Ile Ala Ala Gly Glu Ser Ser Ala Pro Thr Pro Pro Arg
20 25 30

ccc gcg ctt ccc agg agg ttc atc tgc tcc ttc cct gac tgc agc gcc 322
Pro Ala Leu Pro Arg Arg Phe Ile Cys Ser Phe Pro Asp Cys Ser Ala
35 40 45

aat tac agc aaa gcc tgg aag ctt gac gcg cac ctg tgc aag cac acg 370
Asn Tyr Ser Lys Ala Trp Lys Leu Asp Ala His Leu Cys Lys His Thr
50 55 60 65

ggg gag aga cca ttt gtt tgt gac tat gaa ggg tgt ggc aag gcc ttc 418
Gly Glu Arg Pro Phe Val Cys Asp Tyr Glu Gly Cys Gly Lys Ala Phe
70 75 80

atc agg gac tac cat ctg agc cgc cac att ctg act cac aca gga gaa 466
Ile Arg Asp Tyr His Leu Ser Arg His Ile Leu Thr His Thr Gly Glu
85 90 95

aag ccg ttt gtt tgt gca gcc act ggc tgt gat caa aaa ttc aac aca 514
Lys Pro Phe Val Cys Ala Ala Thr Gly Cys Asp Gln Lys Phe Asn Thr
100 105 110

aaa tca aac ttg aag aaa cat ttt gaa cgc aaa cat gaa aat caa caa 562
Lys Ser Asn Leu Lys Lys His Phe Glu Arg Lys His Glu Asn Gln Gln
115 120 125

aaa caa tat ata tgc agt ttt gaa gac tgt aag aag acc ttt aag aaa 610
Lys Gln Tyr Ile Cys Ser Phe Glu Asp Cys Lys Lys Thr Phe Lys Lys
130 135 140 145

cat cag cag ctg aaa atc cat cag tgc cag cat acc aat gaa cct cta 658
His Gln Gln Leu Lys Ile His Gln Cys Gln His Thr Asn Glu Pro Leu
150 155 160

ttc aag tgt acc cag gaa gga tgt ggg aaa cac ttt gca tca ccc agc 706
Phe Lys Cys Thr Gln Glu Gly Cys Gly Lys His Phe Ala Ser Pro Ser
165 170 175

aag ctg aaa cga cat gcc aag gcc cac gag ggc tat gta tgt caa aaa 754
Lys Leu Lys Arg His Ala Lys Ala His Glu Gly Tyr Val Cys Gln Lys
180 185 190

gga tgt tcc ttt gtg gca aaa aca tgg acg gaa ctt ctg aaa cat gtg 802
Gly Cys Ser Phe Val Ala Lys Thr Trp Thr Glu Leu Leu Lys His Val
195 200 205

aga gaa acc cat aaa gag gaa ata cta tgt gaa gta tgc cgg aaa aca 850
Arg Glu Thr His Lys Glu Glu Ile Leu Cys Glu Val Cys Arg Lys Thr
210 215 220 225

ttt aaa cgc aaa gat tac ctt aag caa cac atg aaa act cat gcc cca 898
Phe Lys Arg Lys Asp Tyr Leu Lys Gln His Met Lys Thr His Ala Pro
230 235 240

gaa agg gat gta tgt cgc tgt cca aga gaa ggc tgt gga aga acc tat 946
Glu Arg Asp Val Cys Arg Cys Pro Arg Glu Gly Cys Gly Arg Thr Tyr
245 250 255

act act gtg ttt aat ctc caa agc cat atc ctc tcc ttc cat gag gaa 994
Thr Thr Val Phe Asn Leu Gln Ser His Ile Leu Ser Phe His Glu Glu
260 265 270

agc cgc cct ttt gtg tgt gaa cat gct ggc tgt ggc aaa aca ttt gca 1042
Ser Arg Pro Phe Val Cys Glu His Ala Gly Cys Gly Lys Thr Phe Ala
275 280 285

atg aaa caa agt ctc act agg cat gct gtt gta cat gat cct gac aag 1090
Met Lys Gln Ser Leu Thr Arg His Ala Val Val His Asp Pro Asp Lys
290 295 300 305

aag aaa atg aag ctc aaa gtc aaa aaa tct cgt gaa aaa cgg agt ttg 1138
Lys Lys Met Lys Leu Lys Val Lys Lys Ser Arg Glu Lys Arg Ser Leu
310 315 320

gcc tct cat ctc agt gga tat atc cct ccc aaa agg aaa caa ggg caa 1186
Ala Ser His Leu Ser Gly Tyr Ile Pro Pro Lys Arg Lys Gln Gly Gln
325 330 335

ggc tta tct ttg tgt caa aac gga gag tca ccc aac tgt gtg gaa gac 1234
Gly Leu Ser Leu Cys Gln Asn Gly Glu Ser Pro Asn Cys Val Glu Asp
340 345 350

aag atg ctc tcg aca gtt gca gta ctt acc ctt ggc taa 1273
Lys Met Leu Ser Thr Val Ala Val Leu Thr Leu Gly
355 360 365



<210> 2 
<211> 365 
<212> PRT 
<213> Human 

<400> SEQ ID No: 2 

Leu Asp Pro Pro Ala Val Val Ala Glu Ser Val Ser Ser Leu Thr Ile
1 5 10 15

Ala Asp Ala Phe Ile Ala Ala Gly Glu Ser Ser Ala Pro Thr Pro Pro
20 25 30

Arg Pro Ala Leu Pro Arg Arg Phe Ile Cys Ser Phe Pro Asp Cys Ser
35 40 45

Ala Asn Tyr Ser Lys Ala Trp Lys Leu Asp Ala His Leu Cys Lys His
50 55 60

Thr Gly Glu Arg Pro Phe Val Cys Asp Tyr Glu Gly Cys Gly Lys Ala
65 70 75 80

Phe Ile Arg Asp Tyr His Leu Ser Arg His Ile Leu Thr His Thr Gly
85 90 95

Glu Lys Pro Phe Val Cys Ala Ala Thr Gly Cys Asp Gln Lys Phe Asn
100 105 110

Thr Lys Ser Asn Leu Lys Lys His Phe Glu Arg Lys His Glu Asn Gln
115 120 125

Gln Lys Gln Tyr Ile Cys Ser Phe Glu Asp Cys Lys Lys Thr Phe Lys
130 135 140

Lys His Gln Gln Leu Lys Ile His Gln Cys Gln His Thr Asn Glu Pro
145 150 155 160

Leu Phe Lys Cys Thr Gln Glu Gly Cys Gly Lys His Phe Ala Ser Pro
165 170 175

Ser Lys Leu Lys Arg His Ala Lys Ala His Glu Gly Tyr Val Cys Gln
180 185 190

Lys Gly Cys Ser Phe Val Ala Lys Thr Trp Thr Glu Leu Leu Lys His
195 200 205

Val Arg Glu Thr His Lys Glu Glu Ile Leu Cys Glu Val Cys Arg Lys
210 215 220

Thr Phe Lys Arg Lys Asp Tyr Leu Lys Gln His Met Lys Thr His Ala
225 230 235 240

Pro Glu Arg Asp Val Cys Arg Cys Pro Arg Glu Gly Cys Gly Arg Thr
245 250 255

Tyr Thr Thr Val Phe Asn Leu Gln Ser His Ile Leu Ser Phe His Glu
260 265 270

Glu Ser Arg Pro Phe Val Cys Glu His Ala Gly Cys Gly Lys Thr Phe
275 280 285

Ala Met Lys Gln Ser Leu Thr Arg His Ala Val Val His Asp Pro Asp
290 295 300

Lys Lys Lys Met Lys Leu Lys Val Lys Lys Ser Arg Glu Lys Arg Ser
305 310 315 320

Leu Ala Ser His Leu Ser Gly Tyr Ile Pro Pro Lys Arg Lys Gln Gly
325 330 335

Gln Gly Leu Ser Leu Cys Gln Asn Gly Glu Ser Pro Asn Cys Val Glu
340 345 350

Asp Lys Met Leu Ser Thr Val Ala Val Leu Thr Leu Gly
355 360 365



<210> 3 
<211> 1273 
<212> ADN 
<213> Human 

<400> SEQ ID No: 3 



atgcgcagca gcggcgccga cgcggggcgg tgcctggtga ccgcgcgcgc tcccggaagt 60
gtgccggcgt cgcgcgaagg ttcagcaggg agccgtgggc cgggcgcgcc ggttcccggc 120
acgtgtctcg gcacgtggca gcgcgcctgg ccctgggctt ggaggcgccg gcgccctgga 180
tccgccggcc gtggtcgccg agtcggtgtc gtccttgacc atcgccgacg cgttcattgc 240
agccggcgag agctcagctc cgaccccgcc gcgccccgcg cttcccagga ggttcatctg 300
ctccttccct gactgcagcg ccaattacag caaagcctgg aagcttgacg cgcacctgtg 360
caagcacacg ggggagagac catttgtttg tgactatgaa gggtgtggca aggccttcat 420
cagggactac catctgagcc gccacattct gactcacaca ggagaaaagc cgtttgtttg 480
tgcagccact ggctgtgatc aaaaattcaa cacaaaatca aacttgaaga aacattttga 540
acgcaaacat gaaaatcaac aaaaacaata tatatgcagt tttgaagact gtaagaagac 600
ctttaagaaa catcagcagc tgaaaatcca tcagtgccag cataccaatg aacctctatt 660
caagtgtacc caggaaggat gtgggaaaca ctttgcatca cccagcaagc tgaaacgaca 720
tgccaaggcc cacgagggct atgtatgtca aaaaggatgt tcctttgtgg caaaaacatg 780
gacggaactt ctgaaacatg tgagagaaac ccataaagag gaaatactat gtgaagtatg 840
ccggaaaaca tttaaacgca aagattacct taagcaacac atgaaaactc atgccccaga 900
aagggatgta tgtcgctgtc caagagaagg ctgtggaaga acctatacta ctgtgtttaa 960
tctccaaagc catatcctct ccttccatga ggaaagccgc ccttttgtgt gtgaacatgc 1020
tggctgtggc aaaacatttg caatgaaaca aagtctcact aggcatgctg ttgtacatga 1080
tcctgacaag aagaaaatga agctcaaagt caaaaaatct cgtgaaaaac ggagtttggc 1140
ctctcatctc agtggatata tccctcccaa aaggaaacaa gggcaaggct tatctttgtg 1200
tcaaaacgga gagtcaccca actgtgtgga agacaagatg ctctcgacag ttgcagtact 1260
tacccttggc taa 1273

<210> 4
<211> 1213
<212> ADN
<213> Human

<400> SEQ ID No: 4

gtgccggcgc cgcgcgaagg ttcagcaggg agccgtgggc cgggcgcgcc ggttcccggc 60
acgtgtctcg gcacgtggca gcgcgcctgg ccctgggctt ggaggcgccg gcgccctgga 120
tccgccggcc gtggtcgccg agtcggtgtc gtccttgacc atcgccgacg cgttcattgc 180
agccggcgag agctcagctc cgaccccgcc gcgccccgcg cttcccagga ggttcatctg 240
ctccttccct gactgcagcg ccaattacag caaagcctgg aagcttgacg cgcacctgtg 300
caagcacacg ggggagagac catttgtttg tgactatgaa gggtgtggca aggccttcat 360
cagggactac catctgagcc gccacattct gactcacaca ggagaaaagc cgtttgtttg 420
tgcagccact ggctgtgatc aaaaattcaa cacaaaatca aacttgaaga aacattttga 480
acgcaaacat gaaaatcaac aaaaacaata tatatgcagt tttgaagact gtaagaagac 540
ctttaagaaa catcagcagc tgaaaatcca tcagtgccag cataccaatg aacctctatt 600
caagtgtacc caggaaggat gtgggaaaca ctttgcatca cccagcaagc tgaaacgaca 660
tgccaaggcc cacgagggct atgtatgtca aaaaggatgt tcctttgtgg caaaaacatg 720
gacggaactt ctgaaacatg tgagagaaac ccataaagag gaaatactat gtgaagtatg 780
ccggaaaaca tttaaacgca aagattacct taagcaacac atgaaaactc atgccccaga 840
aagggatgta tgtcgctgtc caagagaagg ctgtggaaga acctatacta ctgtgtttaa 900
tctccaaagc catatcctct ccttccatga ggaaagccgc ccttttgtgt gtgaacatgc 960
tggctgtggc aaaacatttg caatgaaaca aagtctcact aggcatgctg ttgtacatga 1020
tcctgacaag aagaaaatga agctcaaagt caaaaaatct cgtgaaaaac ggagtttggc 1080
ctctcatctc agtggatata tccctcccaa aaggaaacaa gggcaaggct tatctttgtg 1140
tcaaaacgga gagtcaccca actgtgtgga agacaagatg ctctcgacag ttgcagtact 1200
tacccttggc taa 1213



<210> 5 
<211> 34 
<212> ADN 
<213> Human 

<400> SEQ ID No: 5 

cggggtacca aaaatgcgca gcagcggcgc cgac 34 



<210> 6 
<211> 21 
<212> ADN 
<213> Human 

<400> SEQ ID No: 6 

tccttccctg actgcagcgc c 21 



<210> 7 
<211> 20 
<212> ADN 
<213> Human 

<400> SEQ ID No: 7 
tgcacaggtg cgcgtcaagc 20



<210> 8 
<211> 20 
<212> ADN 
<213> Human 

<400> SEQ ID No: 8 

cacaaacaaa tggtctctcc 20 



<210> 9 
<211> 30 
<212> ADN 
<213> Human 



<400> SEQ ID No: 9 

cggtctagat tagccaaggg taagtactgc 30



<210> 10 
<211> 30 
<212> ADN 
<213> Human 



<400> SEQ ID No: 10 

cctcccgggg ccaagggtaa gtactgcaac 30



JC08 Rec'd PCT/PTO 08 MAY 2001

SEQUENCE LISTING 

<110> Hoechst Marion Roussel

<120> Human hTFIIIA gene and coded hTFIIIA protein

<130> 9823seq 

<140> 
<141> 

<160> 10 

<170> PatentIn Vers. 2.0

<210> 1 
<211> 1273 
<212> DNA 
<213> Human 

<220> 
<221> CDS 
<222> (176)..(1270)
<400> 1 

atgcgcagca gcggcgccga cgcggggcgg tgcctggtga ccgcgcgcgc tcccggaagt 60

gtgccggcgt cgcgcgaagg ttcagcaggg agccgtgggc cgggcgcgcc ggttcccggc 120

acgtgtctcg gcacgtggca gcgcgcctgg ccctgggctt ggaggcgccg gcgcc ctg 178
Met
1

gat ccg ccg gcc gtg gtc gcc gag tcg gtg tcg tcc ttg acc atc gcc 226
Asp Pro Pro Ala Val Val Ala Glu Ser Val Ser Ser Leu Thr Ile Ala
5 10 15

gac gcg ttc att gca gcc ggc gag agc tca gct ccg acc ccg ccg cgc 274
Asp Ala Phe Ile Ala Ala Gly Glu Ser Ser Ala Pro Thr Pro Pro Arg
20 25 30

ccc gcg ctt ccc agg agg ttc atc tgc tcc ttc cct gac tgc agc gcc 322
Pro Ala Leu Pro Arg Arg Phe Ile Cys Ser Phe Pro Asp Cys Ser Ala
35 40 45

aat tac agc aaa gcc tgg aag ctt gac gcg cac ctg tgc aag cac acg 370
Asn Tyr Ser Lys Ala Trp Lys Leu Asp Ala His Leu Cys Lys His Thr
50 55 60 65

ggg gag aga cca ttt gtt tgt gac tat gaa ggg tgt ggc aag gcc ttc 418
Gly Glu Arg Pro Phe Val Cys Asp Tyr Glu Gly Cys Gly Lys Ala Phe
70 75 80

atc agg gac tac cat ctg agc cgc cac att ctg act cac aca gga gaa 466
Ile Arg Asp Tyr His Leu Ser Arg His Ile Leu Thr His Thr Gly Glu
85 90 95

aag ccg ttt gtt tgt gca gcc act ggc tgt gat caa aaa ttc aac aca 514
Lys Pro Phe Val Cys Ala Ala Thr Gly Cys Asp Gln Lys Phe Asn Thr
100 105 110

aaa tca aac ttg aag aaa cat ttt gaa cgc aaa cat gaa aat caa caa 562
Lys Ser Asn Leu Lys Lys His Phe Glu Arg Lys His Glu Asn Gln Gln
115 120 125

aaa caa tat ata tgc agt ttt gaa gac tgt aag aag acc ttt aag aaa 610
Lys Gln Tyr Ile Cys Ser Phe Glu Asp Cys Lys Lys Thr Phe Lys Lys
130 135 140 145

cat cag cag ctg aaa atc cat cag tgc cag cat acc aat gaa cct cta 658
His Gln Gln Leu Lys Ile His Gln Cys Gln His Thr Asn Glu Pro Leu
150 155 160

ttc aag tgt acc cag gaa gga tgt ggg aaa cac ttt gca tca ccc agc 706
Phe Lys Cys Thr Gln Glu Gly Cys Gly Lys His Phe Ala Ser Pro Ser
165 170 175

aag ctg aaa cga cat gcc aag gcc cac gag ggc tat gta tgt caa aaa 754
Lys Leu Lys Arg His Ala Lys Ala His Glu Gly Tyr Val Cys Gln Lys
180 185 190

gga tgt tcc ttt gtg gca aaa aca tgg acg gaa ctt ctg aaa cat gtg 802
Gly Cys Ser Phe Val Ala Lys Thr Trp Thr Glu Leu Leu Lys His Val
195 200 205

aga gaa acc cat aaa gag gaa ata cta tgt gaa gta tgc cgg aaa aca 850
Arg Glu Thr His Lys Glu Glu Ile Leu Cys Glu Val Cys Arg Lys Thr
210 215 220 225

ttt aaa cgc aaa gat tac ctt aag caa cac atg aaa act cat gcc cca 898
Phe Lys Arg Lys Asp Tyr Leu Lys Gln His Met Lys Thr His Ala Pro
230 235 240

gaa agg gat gta tgt cgc tgt cca aga gaa ggc tgt gga aga acc tat 946
Glu Arg Asp Val Cys Arg Cys Pro Arg Glu Gly Cys Gly Arg Thr Tyr
245 250 255

act act gtg ttt aat ctc caa agc cat atc ctc tcc ttc cat gag gaa 994
Thr Thr Val Phe Asn Leu Gln Ser His Ile Leu Ser Phe His Glu Glu
260 265 270

agc cgc cct ttt gtg tgt gaa cat gct ggc tgt ggc aaa aca ttt gca 1042
Ser Arg Pro Phe Val Cys Glu His Ala Gly Cys Gly Lys Thr Phe Ala
275 280 285

atg aaa caa agt ctc act agg cat gct gtt gta cat gat cct gac aag 1090
Met Lys Gln Ser Leu Thr Arg His Ala Val Val His Asp Pro Asp Lys
290 295 300 305

aag aaa atg aag ctc aaa gtc aaa aaa tct cgt gaa aaa cgg agt ttg 1138
Lys Lys Met Lys Leu Lys Val Lys Lys Ser Arg Glu Lys Arg Ser Leu
310 315 320

gcc tct cat ctc agt gga tat atc cct ccc aaa agg aaa caa ggg caa 1186
Ala Ser His Leu Ser Gly Tyr Ile Pro Pro Lys Arg Lys Gln Gly Gln
325 330 335

ggc tta tct ttg tgt caa aac gga gag tca ccc aac tgt gtg gaa gac 1234
Gly Leu Ser Leu Cys Gln Asn Gly Glu Ser Pro Asn Cys Val Glu Asp
340 345 350

aag atg ctc tcg aca gtt gca gta ctt acc ctt ggc taa 1273
Lys Met Leu Ser Thr Val Ala Val Leu Thr Leu Gly
355 360 365
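
The feature table for this sequence places the coding region at positions 176 to 1270 of a 1273-nucleotide sequence, and the protein entry that follows gives a length of 365. As a small consistency check, sketched in Python under the assumption that both coordinates are 1-based and inclusive, the span can be converted into a codon count. The function name `cds_summary` is hypothetical and not part of the listing.

```python
def cds_summary(start: int, end: int, total_length: int) -> dict:
    """Summarise a coding region given 1-based inclusive coordinates."""
    length = end - start + 1
    codons, remainder = divmod(length, 3)
    return {
        "length_nt": length,               # nucleotides inside the CDS
        "codons": codons,                  # complete codons in the CDS
        "remainder": remainder,            # 0 when the span is in frame
        "trailing_nt": total_length - end  # nucleotides after the CDS
    }


print(cds_summary(176, 1270, 1273))
# {'length_nt': 1095, 'codons': 365, 'remainder': 0, 'trailing_nt': 3}
```

The 365 codons agree with the stated protein length, and the three trailing nucleotides correspond to the final `taa` triplet printed at positions 1271 to 1273.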



<210> 2 






<211> 365 
<212> PRT 
<213> Human 

<400> 2 

Met Asp Pro Pro Ala Val Val Ala Glu Ser Val Ser Ser Leu Thr Ile
1 5 10 15

Ala Asp Ala Phe Ile Ala Ala Gly Glu Ser Ser Ala Pro Thr Pro Pro
20 25 30

Arg Pro Ala Leu Pro Arg Arg Phe Ile Cys Ser Phe Pro Asp Cys Ser
35 40 45

Ala Asn Tyr Ser Lys Ala Trp Lys Leu Asp Ala His Leu Cys Lys His
50 55 60

Thr Gly Glu Arg Pro Phe Val Cys Asp Tyr Glu Gly Cys Gly Lys Ala
65 70 75 80

Phe Ile Arg Asp Tyr His Leu Ser Arg His Ile Leu Thr His Thr Gly
85 90 95

Glu Lys Pro Phe Val Cys Ala Ala Thr Gly Cys Asp Gln Lys Phe Asn
100 105 110

Thr Lys Ser Asn Leu Lys Lys His Phe Glu Arg Lys His Glu Asn Gln
115 120 125

Gln Lys Gln Tyr Ile Cys Ser Phe Glu Asp Cys Lys Lys Thr Phe Lys
130 135 140

Lys His Gln Gln Leu Lys Ile His Gln Cys Gln His Thr Asn Glu Pro
145 150 155 160

Leu Phe Lys Cys Thr Gln Glu Gly Cys Gly Lys His Phe Ala Ser Pro
165 170 175

Ser Lys Leu Lys Arg His Ala Lys Ala His Glu Gly Tyr Val Cys Gln
180 185 190

Lys Gly Cys Ser Phe Val Ala Lys Thr Trp Thr Glu Leu Leu Lys His
195 200 205

Val Arg Glu Thr His Lys Glu Glu Ile Leu Cys Glu Val Cys Arg Lys
210 215 220

Thr Phe Lys Arg Lys Asp Tyr Leu Lys Gln His Met Lys Thr His Ala
225 230 235 240

Pro Glu Arg Asp Val Cys Arg Cys Pro Arg Glu Gly Cys Gly Arg Thr
245 250 255

Tyr Thr Thr Val Phe Asn Leu Gln Ser His Ile Leu Ser Phe His Glu
260 265 270

Glu Ser Arg Pro Phe Val Cys Glu His Ala Gly Cys Gly Lys Thr Phe
275 280 285

Ala Met Lys Gln Ser Leu Thr Arg His Ala Val Val His Asp Pro Asp
290 295 300

Lys Lys Lys Met Lys Leu Lys Val Lys Lys Ser Arg Glu Lys Arg Ser
305 310 315 320

Leu Ala Ser His Leu Ser Gly Tyr Ile Pro Pro Lys Arg Lys Gln Gly
325 330 335

Gln Gly Leu Ser Leu Cys Gln Asn Gly Glu Ser Pro Asn Cys Val Glu
340 345 350

Asp Lys Met Leu Ser Thr Val Ala Val Leu Thr Leu Gly
355 360 365



<210> 3 
<211> 1273 
<212> DNA 
<213> Human 

<400> 3 

atgcgcagca gcggcgccga cgcggggcgg tgcctggtga ccgcgcgcgc tcccggaagt 60 
gtgccggcgt cgcgcgaagg ttcagcaggg agccgtgggc cgggcgcgcc ggttcccggc 120 

acgtgtctcg gcacgtggca gcgcgcctgg ccctgggctt ggaggcgccg gcgccctgga 180 

tccgccggcc gtggtcgccg agtcggtgtc gtccttgacc atcgccgacg cgttcattgc 240 

agccggcgag agctcagctc cgaccccgcc gcgccccgcg cttcccagga ggttcatctg 300 

ctccttccct gactgcagcg ccaattacag caaagcctgg aagcttgacg cgcacctgtg 360 

caagcacacg ggggagagac catttgtttg tgactatgaa gggtgtggca aggccttcat 420 

cagggactac catctgagcc gccacattct gactcacaca ggagaaaagc cgtttgtttg 480

tgcagccact ggctgtgatc aaaaattcaa cacaaaatca aacttgaaga aacattttga 540

acgcaaacat gaaaatcaac aaaaacaata tatatgcagt tttgaagact gtaagaagac 600

ctttaagaaa catcagcagc tgaaaatcca tcagtgccag cataccaatg aacctctatt 660

caagtgtacc caggaaggat gtgggaaaca ctttgcatca cccagcaagc tgaaacgaca 720

tgccaaggcc cacgagggct atgtatgtca aaaaggatgt tcctttgtgg caaaaacatg 780

gacggaactt ctgaaacatg tgagagaaac ccataaagag gaaatactat gtgaagtatg 840

ccggaaaaca tttaaacgca aagattacct taagcaacac atgaaaactc atgccccaga 900

aagggatgta tgtcgctgtc caagagaagg ctgtggaaga acctatacta ctgtgtttaa 960 

tctccaaagc catatcctct ccttccatga ggaaagccgc ccttttgtgt gtgaacatgc 1020 

tggctgtggc aaaacatttg caatgaaaca aagtctcact aggcatgctg ttgtacatga 1080 

tcctgacaag aagaaaatga agctcaaagt caaaaaatct cgtgaaaaac ggagtttggc 1140 

ctctcatctc agtggatata tccctcccaa aaggaaacaa gggcaaggct tatctttgtg 1200 

tcaaaacgga gagtcaccca actgtgtgga agacaagatg ctctcgacag ttgcagtact 1260 
tacccttggc taa 1273 

<210> 4 
<211> 1213 
<212> DNA 
<213> Human 

<400> 4 



gtgccggcgc cgcgcgaagg ttcagcaggg agccgtgggc cgggcgcgcc ggttcccggc 60
acgtgtctcg gcacgtggca gcgcgcctgg ccctgggctt ggaggcgccg gcgccctgga 120
tccgccggcc gtggtcgccg agtcggtgtc gtccttgacc atcgccgacg cgttcattgc 180
agccggcgag agctcagctc cgaccccgcc gcgccccgcg cttcccagga ggttcatctg 240
ctccttccct gactgcagcg ccaattacag caaagcctgg aagcttgacg cgcacctgtg 300
caagcacacg ggggagagac catttgtttg tgactatgaa gggtgtggca aggccttcat 360
cagggactac catctgagcc gccacattct gactcacaca ggagaaaagc cgtttgtttg 420
tgcagccact ggctgtgatc aaaaattcaa cacaaaatca aacttgaaga aacattttga 480
acgcaaacat gaaaatcaac aaaaacaata tatatgcagt tttgaagact gtaagaagac 540
ctttaagaaa catcagcagc tgaaaatcca tcagtgccag cataccaatg aacctctatt 600
caagtgtacc caggaaggat gtgggaaaca ctttgcatca cccagcaagc tgaaacgaca 660
tgccaaggcc cacgagggct atgtatgtca aaaaggatgt tcctttgtgg caaaaacatg 720
gacggaactt ctgaaacatg tgagagaaac ccataaagag gaaatactat gtgaagtatg 780
ccggaaaaca tttaaacgca aagattacct taagcaacac atgaaaactc atgccccaga 840
aagggatgta tgtcgctgtc caagagaagg ctgtggaaga acctatacta ctgtgtttaa 900
tctccaaagc catatcctct ccttccatga ggaaagccgc ccttttgtgt gtgaacatgc 960
tggctgtggc aaaacatttg caatgaaaca aagtctcact aggcatgctg ttgtacatga 1020
tcctgacaag aagaaaatga agctcaaagt caaaaaatct cgtgaaaaac ggagtttggc 1080
ctctcatctc agtggatata tccctcccaa aaggaaacaa gggcaaggct tatctttgtg 1140
tcaaaacgga gagtcaccca actgtgtgga agacaagatg ctctcgacag ttgcagtact 1200
tacccttggc taa 1213

<210> 5
<211> 34
<212> DNA
<213> Human

<400> 5

cggggtacca aaaatgcgca gcagcggcgc cgac 34

<210> 6
<211> 21
<212> DNA
<213> Human

<400> 6

tccttccctg actgcagcgc c 21

<210> 7
<211> 20
<212> DNA
<213> Human

<400> 7

tgcacaggtg cgcgtcaagc 20



<210> 8 
<211> 20 
<212> DNA 
<213> Human 

<400> 8 

cacaaacaaa tggtctctcc 20



<210> 9 
<211> 30 
<212> DNA 
<213> Human 

<400> 9 

cggtctagat tagccaaggg taagtactgc 30

<210> 10 
<211> 30 
<212> DNA 
<213> Human 

<400> 10 

cctcccgggg ccaagggtaa gtactgcaac 30
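
Some of the short sequences in this listing read as the reverse complement of a stretch of the long sequence rather than as a direct copy of it. The Python sketch below shows one way to check this and is offered only as an illustration. The function name `reverse_complement` is hypothetical, and the template fragment is assumed to be nucleotides 341 to 370 of SEQ ID No: 3 as printed in the listing.

```python
# Translation table for complementing DNA bases in either case.
COMPLEMENT = str.maketrans("acgtACGT", "tgcaTGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


# SEQ ID No: 7 with the column spacing removed.
oligo = "tgcacaggtgcgcgtcaagc"

# Assumed template fragment: nucleotides 341..370 of SEQ ID No: 3.
template = "aagcttgacgcgcacctgtgcaagcacacg"

site = reverse_complement(oligo)
print(site)                 # gcttgacgcgcacctgtgca
print(template.find(site))  # 2
```

A result of 2 means the match starts at the third nucleotide of the fragment, so the complemented oligonucleotide lies entirely within it.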
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Attorney Docket Number 


146.1364

DECLARATION FOR UTILITY OR DESIGN PATENT APPLICATION

First Named Inventor: BORDON-PALLIER et al

COMPLETE IF KNOWN

Application Number: PCT/FR99/02738

Filing Date: 11/9/99

Group Art Unit:

Examiner Name:

Declaration Submitted with Initial Filing OR Declaration Submitted after Initial Filing



As a below named Inventor, I hereby declare that: 

My residence, post office address, and citizenship are as stated below next to my name. 

I believe I am the original, first and sole inventor (if only one name is listed below) or an original, first and joint inventor (if plural names are listed below) of the subject matter which is claimed and for which a patent is sought on the invention entitled:



HUMAN hTFIIIA GENE AND CODED hTFIIIA PROTEIN

(Title of the Invention)

the specification of which

is attached hereto
OR
☒ was filed on (MM/DD/YYYY) Nov. 9, 1999 as United States Application Number or PCT International Application Number PCT/FR99/02738 and was amended on (MM/DD/YYYY)



I hereby state that I have reviewed and understand the contents of the above identified specification, including the claims, as amended by any amendment specifically referred to above.

I acknowledge the duty to disclose information which is material to patentability as defined in Title 37, Code of Federal Regulations, § 1.56.



I hereby claim foreign priority benefits under Title 35, United States Code §119(a)-(d) or §365(b) of any foreign application(s) for patent or inventor's certificate, or §365(a) of any PCT international application which designated at least one country other than the United States of America, listed below and have also identified below, by checking the box, any foreign application for patent or inventor's certificate, or of any PCT international application having a filing date before that of the application on which priority is claimed.

Prior Foreign            Country    Foreign Filing Date    Priority       Certified Copy Attached?
Application Number(s)               (MM/DD/YYYY)           Not Claimed    YES    NO

98/14146                 France     11/10/98               □              □      □
PCT/FR99/02738           France     11/9/99                □              □      □



□ Additional foreign application numbers are listed on a supplemental priority sheet attached hereto.



I hereby claim the benefit under Title 35, United States Code §119(e) of any United States provisional application(s) listed below.



Application Number(s) 



Filing Date (MM/DD/YYYY)



□ 



Additional provisional application 
numbers are listed on a 
supplemental priority sheet 
attached hereto. 
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(January 1 997) 
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DECLARATION 



I hereby claim the benefit under Title 35, United States Code §120 of any United States application(s), or §365(c) of any PCT international application designating the United States of America, listed below and, insofar as the subject matter of each of the claims of this application is not disclosed in the prior United States or PCT international application in the manner provided by the first paragraph of Title 35, United States Code §112, I acknowledge the duty to disclose information which is material to patentability as defined in Title 37, Code of Federal Regulations §1.56 which became available between the filing date of the prior application and the national or PCT international filing date of this application.



U.S. Parent Application 
Number 



PCT Parent 
Number 



Parent Filing Date 
(MM/DD/YYYY)



Parent Patent Number 
(if applicable) 



□ Additional U.S. or PCT international application numbers are listed on a supplemental priority sheet attached hereto.



As a named inventor, I hereby appoint the following registered practitioner(s) to prosecute this application and to transact all business in the Patent and Trademark Office connected therewith:



Name                            Registration Number

Charles A. Muserlian            19,683
Jordan B. Bierman               18,629
Donald C. Lucas                 31,275

Bierman, Muserlian and Lucas



□ Additional registered practitioner(s) named on a supplemental sheet attached hereto.



Direct all correspondence to:

Name: Bierman, Muserlian and Lucas
Address: 600 Third Avenue
City: New York    State: New York    ZIP: 10016
Country: U.S.A.    Telephone: (212) 661-8000    Fax: (212) 661-8002



I hereby declare that all statements made herein of my own knowledge are true and that all statements made on information and belief are believed to be true; and further that these statements were made with the knowledge that willful false statements and the like so made are punishable by fine or imprisonment, or both, under Section 1001 of Title 18 of the United States Code and that such willful false statements may jeopardize the validity of the application or any patent issued thereon.



Name of Sole or First Inventor: 



□ A petition has been filed for this unsigned inventor



Given 
Name 



FLORENCE



Middle 




Family 




Suffix 


Initial 




Name 


BORDON-PALLIER


e.g. Jr. 



Inventor's 
Signature 



Residence: City 



Guyancourt 



State Country France 



Post Office Address 



fnx 



Citizenship pj^ 



Post Office Address 



37, boulevard Beethoven 



Guyancourt 



F-78280 



France 



□ Additional inventors are being named on supplemental sheet(s) attached hereto
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DECLARATION 



ADDITIONAL INVENTOR(S)
Supplemental Sheet 



3 




Post Office Address



3, rue Elisa Lemonnier 



City 



Paris 



Name of Additional Joint Inventor, if 



Country France



Given 
Name 



any: 



Inventor's 
Signature 



□ A petition has been filed for this unsigned inventor



Suffix 

e.g. Jr. 



Residence: 
City 



Post Office Address 



Post Office Address



Name of Additional Joint inventor, if any: 



Given 
Name 



Inventor's
Signature 



□ A petition has been filed for this unsigned inventor



Family

Name



Residence: 
City 



Post Office Address 



Post Office Address 



Suffix

e.g. Jr.



City 



Name of Additional Joint Inventor, if any: 



Given 
Name 



Inventor's 
Signature 



Middle 




Family


Initial 




Name 



□ A petition has been filed for this unsigned inventor



Residence: 
City 



Post Office Address



Post Office Address 



Suffix 



Country 



□ Additional inventors are being named on supplemental sheet(s) attached hereto
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